erayerdin@programming.dev to

Rust@programming.dev · 4 months ago

How do I parse the escape characters of the content of a string literal input with nom?

3

8

How do I parse the escape characters of the content of a string literal input with nom?

erayerdin@programming.dev to

Rust@programming.dev · 4 months ago

3

So, I’m basically trying to parse a string literal with nom. This is the code I’ve come up with:

use nom::{
    bytes::complete::{tag, take_until},
    sequence::delimited,
    IResult,
};

/// Parses string literals.
fn parse_literal<'a>(input: &'a str) -> IResult<&'a str, &'a str> {
    // escape tag identifier is the same as delimiter, obviously
    let escape_tag_identifier =
        input
            .chars()
            .nth(0)
            .ok_or(nom::Err::Error(nom::error::Error::new(
                input,
                nom::error::ErrorKind::Verify,
            )))?;

    let (remaining, value) = delimited(
        tag(escape_tag_identifier.to_string().as_str()),
        take_until(match escape_tag_identifier {
            '\'' => "'",
            '"' => "\"",
            _ => unreachable!("parse_literal>>take_until branched into unreachable."),
        }),
        tag(escape_tag_identifier.to_string().as_str()),
    )(input)?;

    Ok((remaining, value))
}

#[cfg(test)]
mod literal_tests {
    use super::*;

    #[rstest]
    #[case(r#""foo""#, "foo")]
    #[case(r#""foo bar""#, "foo bar")]
    #[case(r#""foo \" bar""#, r#"foo " bar"#)]
    fn test_dquotes(#[case] input: &str, #[case] expected_output: &str) {
        let result = parse_literal(input);
        assert_eq!(result, Ok(("", expected_output)));
    }

    #[rstest]
    #[case("'foo'", "foo")]
    #[case("'foo bar'", "foo bar")]
    #[case(r#"'foo \' bar'"#, "foo ' bar")]
    fn test_squotes(#[case] input: &str, #[case] expected_output: &str) {
        let result = parse_literal(input);
        assert_eq!(result, Ok(("", expected_output)));
    }

    #[rstest]
    #[case(r#""foo'"#, "foo'")]
    #[case(r#"'foo""#, r#"foo""#)]
    fn test_errs(#[case] input: &str, #[case] expected_err_input: &str) {
        let result = parse_literal(input);
        assert_eq!(
            result,
            Err(nom::Err::Error(nom::error::Error::new(
                expected_err_input,
                nom::error::ErrorKind::TakeUntil
            ))),
        );
    }
}

Note: The example uses rstest for tests.

Although it looks a little bit complex, actually, it is not. Basically, the parse function is parse_literal. The tests are separated for double quotes and single quotes and errors.,

When you run the tests, you will realize first and second cases for single and double quotes run successfully. The problem is with the third case of each: #[case(r#""foo \" bar""#, r#"foo " bar"#)] for test_dquotes and #[case(r#"'foo \' bar'"#, "foo ' bar")] for test_squotes.

Ideally, if a string literal is defined with single quotes and has single quotes in its content, the single quotes can be escaped with single quotes again. Same goes for double quotes as well. To demonstrate in a pseudocode:

"foo ' bar" // is ok
"foo \" bar" // is ok
"foo " bar" // is err
'foo " bar' // is ok
'foo \' bar' // is ok
'foo ' bar' // is err

Currently, in the code, I take characters until the delimiter with take_until, which reaches to the end of the input, which, let’s say, in this case, is guaranteed to contain only and only the string literal as input. So it’s kind of okay for first and second cases in the tests.

But, of course, this fails in the third cases of each test since the input has the delimiter character early on, finishes early and returns the remaining.

This is only for research purposes, so you do not need to give a fully-featured answer. A pathway is, as well, appreciated.

Thanks in advance.

Chat

oliveoilcheff@programming.dev
link
fedilink
English
arrow-up
4·
4 months ago
Have you tried the escaped function? https://docs.rs/nom/latest/nom/bytes/complete/fn.escaped.html
- erayerdin@programming.devOP
  link
  fedilink
  arrow-up
  1·
  edit-2
  4 months ago
  Hmm, didn’t see that. Lemme play with that a little. Maybe I can come up with something. Thank you.

Rust@programming.dev

rust@programming.dev

You are not logged in. However you can subscribe from another Fediverse account, for example Lemmy or Mastodon. To do this, paste the following into the search field of your instance: !rust@programming.dev

Welcome to the Rust community! This is a place to discuss about the Rust programming language.

Wormhole

!performance@programming.dev

Credits

The icon is a modified version of the official rust logo (changing the colors to a gradient and black background)

Visibility: Public

This community can be federated to other instances and be posted/commented in by their users.

35 users / day
51 users / week
544 users / month
1.84K users / 6 months
24 local subscribers
5.97K subscribers
878 Posts
3.86K Comments
Modlog