weAutSys's parser functions provide a fast way to handle command line and other text input; the work can be divided into steps separated by thread yields.
Standard signature for parsing functions is
uint8_t parseResult(resultType * result, char * s, uint8_t whereParseStarts)
The return value is the index of the first character in s not consumed for the result, i.e. the index at which the next parse can start. On failure 0 is returned and result is left unchanged.
Note: Changing that signature to the following is under consideration:
uint8_t parseResult(resultType * result, char * whereParseStarts)
Here the return value is the number of characters consumed for the result. On failure 0 is returned and result is left unchanged.
This change would allow some performance gain in parsing but would slightly complicate the handling.
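The difference between the two conventions can be illustrated with a toy parser. This is a minimal sketch, not weAutSys code: the names sketchParseNumIdx and sketchParseNumPtr are hypothetical, and both assume a short 0-terminated RAM string and a decimal value of at most 255.

```c
#include <stdint.h>

/* Hypothetical parser in the current, index-based convention.
 * Returns the index after the number, or 0 on failure (result untouched). */
static uint8_t sketchParseNumIdx(uint8_t *result, char *s, uint8_t si)
{
    uint16_t val = 0;
    uint8_t i = si;
    while (s[i] >= '0' && s[i] <= '9') {
        val = (uint16_t)(val * 10 + (s[i] - '0'));
        if (val > 255) return 0;          /* does not fit into one byte */
        ++i;
    }
    if (i == si) return 0;                /* no digit found */
    *result = (uint8_t)val;
    return i;                             /* always >= 1 on success */
}

/* The same parser in the pointer-based convention under consideration.
 * Returns the number of characters consumed, or 0 on failure. */
static uint8_t sketchParseNumPtr(uint8_t *result, char *whereParseStarts)
{
    uint16_t val = 0;
    char *p = whereParseStarts;
    while (*p >= '0' && *p <= '9') {
        val = (uint16_t)(val * 10 + (*p - '0'));
        if (val > 255) return 0;
        ++p;
    }
    if (p == whereParseStarts) return 0;
    *result = (uint8_t)val;
    return (uint8_t)(p - whereParseStarts);
}
```

With the index form the caller passes the returned value straight into the next parse call; with the pointer form the caller has to add the consumed count to its own position, which is the slightly more complicated handling mentioned above.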
Defines | |
#define | INDEX_OFFSET_LIST2 125 |
Offset for token found in second list by searchTokenIn. | |
Functions | |
uint8_t | asDigit (uint8_t c) __attribute__((always_inline)) |
Recognise one digit. | |
char * | getFirstSVNtokenP (char *dest, char const *src, uint8_t mxLen) |
Copy the first SVN token or a full string from program space. | |
uint8_t | isContainedIn (char s[], const uint8_t c) |
Check if a character is contained in a RAM string. | |
uint8_t | isContainedIn_P (char s[], const uint8_t c) |
Check if a character is contained in a flash string. | |
uint8_t | parse2hex (uint8_t *res, char s[], uint8_t si) |
Parse a two digit hexadecimal number. | |
uint8_t | parseByteNum (uint8_t *res, char s[], uint8_t si) |
Parse a one byte number. | |
uint8_t | parseDate (date_t *dat, char s[], uint8_t si) |
Parse a date. | |
uint8_t | parseDWordNum (uint32_t *res, char s[], uint8_t si) |
Parse a four byte number. | |
uint8_t | parseIpAdd (uip_ipaddr_t ipAddr, char s[], uint8_t si) |
Parse an IPv4 address. | |
uint8_t | parseMACadd (eth_addr_t *macAddr, char s[], uint8_t si) |
Parse a MAC address. | |
uint8_t | parseTim (datdur_t *tim, char s[], uint8_t si) |
Parse a (clock) time. | |
uint8_t | parseWordNum (uint16_t *res, char s[], uint8_t si) |
Parse a two byte number. | |
uint8_t | searchFirstToken (char s[], uint8_t si, uint8_t len) |
Search the first token in a short (RAM) string. | |
uint8_t | searchTokenEnd (char s[], uint8_t si, uint8_t len) |
Search the end of a token in a short (RAM) string. | |
uint8_t | searchTokenIn (char s[], uint8_t si, uint8_t se, char const *const *list, char const *const *list2) |
Match a (short) token in a (RAM) string to a list of flash strings. | |
uint8_t | searchTokenStart (char s[], uint8_t si, uint8_t len) |
Search the next token in a short (RAM) string. |
uint8_t parseIpAdd(uip_ipaddr_t ipAddr, char s[], uint8_t si)
Parse an IPv4 address.
This function tries to interpret a part of the string s starting at si as an IP address expected in the form "112.12.0.1", i.e. four dot-separated numbers of up to three decimal digits each.
Leading blanks will be skipped.
ipAddr | pointer to the result (uip_ipaddr_t is an array type); the result is changed only if an IP address could be parsed successfully |
s | the RAM string containing the digit(s) |
si | the first digit's index in s |
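The contract described above can be sketched as follows. This is an illustration under assumptions, not the weAutSys implementation: sketchParseIp is a hypothetical name, a plain uint8_t[4] stands in for uip_ipaddr_t, and rejecting parts above 255 is an assumption the text does not state.

```c
#include <stdint.h>

/* Hypothetical sketch: parse "a.b.c.d" starting at s[si].
 * Returns the index after the address, or 0 on failure (ip untouched). */
static uint8_t sketchParseIp(uint8_t ip[4], char s[], uint8_t si)
{
    uint8_t tmp[4];                       /* commit only on full success */
    while (s[si] == ' ') ++si;            /* skip leading blanks */
    for (uint8_t part = 0; part < 4; ++part) {
        if (part > 0) {
            if (s[si] != '.') return 0;   /* separator missing */
            ++si;
        }
        uint16_t val = 0;
        uint8_t digits = 0;
        while (digits < 3 && s[si] >= '0' && s[si] <= '9') {
            val = (uint16_t)(val * 10 + (s[si] - '0'));
            ++si;
            ++digits;
        }
        if (digits == 0 || val > 255) return 0;   /* assumed range check */
        tmp[part] = (uint8_t)val;
    }
    for (uint8_t i = 0; i < 4; ++i) ip[i] = tmp[i];
    return si;
}
```

Collecting the four parts in a temporary array is one simple way to honour the rule that the result changes only on success.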
uint8_t parseMACadd(eth_addr_t *macAddr, char s[], uint8_t si)
Parse a MAC address.
This function tries to interpret a part of the string s starting at si as a MAC address expected in the form "11:2B:3C:4D:5E:6F", i.e. six colon-separated hexadecimal numbers of (up to) two digits each.
Leading blanks will be skipped.
macAddr | pointer to the result (eth_addr_t is a structure containing an array type); the result is changed only if a MAC address could be parsed successfully |
s | the RAM string containing the digit(s) |
si | the first digit's index in s |
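A minimal sketch of such a MAC parser is shown below. The struct sketchEthAddr_t is a hypothetical stand-in for eth_addr_t (whose real layout is not shown here), and sketchMacHexVal and sketchParseMac are hypothetical names as well.

```c
#include <stdint.h>

/* Hypothetical stand-in for eth_addr_t: a structure containing an array. */
typedef struct { uint8_t addr[6]; } sketchEthAddr_t;

/* Value of one hex digit, 16 if c is not a hex digit. */
static uint8_t sketchMacHexVal(char c)
{
    if (c >= '0' && c <= '9') return (uint8_t)(c - '0');
    if (c >= 'a' && c <= 'f') return (uint8_t)(c - 'a' + 10);
    if (c >= 'A' && c <= 'F') return (uint8_t)(c - 'A' + 10);
    return 16;
}

/* Parse six colon-separated groups of one or two hex digits.
 * Returns the index after the address, or 0 on failure (mac untouched). */
static uint8_t sketchParseMac(sketchEthAddr_t *mac, char s[], uint8_t si)
{
    sketchEthAddr_t tmp;
    while (s[si] == ' ') ++si;                    /* skip leading blanks */
    for (uint8_t part = 0; part < 6; ++part) {
        if (part > 0) {
            if (s[si] != ':') return 0;
            ++si;
        }
        uint8_t d = sketchMacHexVal(s[si]);
        if (d > 15) return 0;                     /* at least one digit */
        uint8_t val = d;
        ++si;
        d = sketchMacHexVal(s[si]);
        if (d <= 15) {                            /* optional second digit */
            val = (uint8_t)((val << 4) | d);
            ++si;
        }
        tmp.addr[part] = val;
    }
    *mac = tmp;
    return si;
}
```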
uint8_t parseTim(datdur_t *tim, char s[], uint8_t si)
Parse a (clock) time.
This function tries to interpret a part of the string s starting at si as a (clock) time. The expected format is "hh:mm:ss" or "hh:mm" with the seconds implied as 0. In case of success the index of the first character after the parsed time is returned, and 0 otherwise.
Leading and trailing blanks at every segment will be skipped, as will (allowed) leading zeroes. Hence "17:3:9" and " 17 : 03 :09 " will get the same interpretation.
tim | pointer to the result structure; may be changed partially even if the parse was not successful |
s | the RAM string containing the time |
si | the index of the time's first character in s |
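The segment-wise parsing described above can be sketched like this. Several points are assumptions: sketchTime_t is a hypothetical stand-in for datdur_t, the limits 24/60/60 are assumed range checks, and the returned index is taken to lie after the trailing blanks. As in the description, the result may be partially written even when the parse fails.

```c
#include <stdint.h>

/* Hypothetical stand-in for the time part of datdur_t. */
typedef struct { uint8_t h, m, s; } sketchTime_t;

/* Skip blanks, read a decimal number below limit, skip blanks again.
 * Returns the index after the segment, or 0 on failure. */
static uint8_t sketchTimeSeg(uint8_t *out, uint8_t limit, char s[], uint8_t si)
{
    uint16_t val = 0;
    uint8_t digits = 0;
    while (s[si] == ' ') ++si;
    while (s[si] >= '0' && s[si] <= '9') {
        val = (uint16_t)(val * 10 + (s[si] - '0'));
        if (val >= limit) return 0;       /* assumed range check */
        ++si;
        ++digits;
    }
    if (digits == 0) return 0;
    while (s[si] == ' ') ++si;
    *out = (uint8_t)val;
    return si;
}

/* Parse "hh:mm:ss" or "hh:mm" (seconds implied as 0). */
static uint8_t sketchParseTime(sketchTime_t *tim, char s[], uint8_t si)
{
    si = sketchTimeSeg(&tim->h, 24, s, si);
    if (si == 0 || s[si] != ':') return 0;
    si = sketchTimeSeg(&tim->m, 60, s, (uint8_t)(si + 1));
    if (si == 0) return 0;
    if (s[si] != ':') {                   /* "hh:mm" form */
        tim->s = 0;
        return si;
    }
    return sketchTimeSeg(&tim->s, 60, s, (uint8_t)(si + 1));
}
```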
uint8_t parseDate(date_t *dat, char s[], uint8_t si)
Parse a date.
This function tries to interpret a part of the string s starting at si as a date. The expected format is "yyyy-mm-dd" or "yy-mm-dd", with a two-digit year interpreted as 2008 .. 2107, which is more than the recommended / allowed range of years.
Leading and trailing blanks at every segment will be skipped, as will (allowed) leading zeroes. Hence "12-18-1" and " 2012 - 18 - 01 " and even " 2012 - 18 - 0x00001 " will get the same interpretation.
This function does only rudimentary range checks on the date string ((20)08 .. 2169, 1..12, 1..31). If one check fails or the string can't be interpreted, 0 is returned. Fields in dat may have been changed nevertheless. In case of failure the field wd (week day) in dat is set to 0 (illegal / unknown) instead of the correct value (1..7).
dat | pointer to the result structure; may be changed partially even if the parse was not successful |
s | the RAM string containing the date |
si | the first index of the date in s |
uint8_t searchFirstToken(char s[], uint8_t si, uint8_t len)
Search the first token in a short (RAM) string.
This function returns the start index (>=si) of the token found in string s. A token may begin with any non-control character except space, comma, semicolon and equals ( ,;=).
255 is returned if no token was found before [len-1].
This function does almost the same as searchTokenStart() except that it skips over 0-codes.
The rationale for this 0-code skipping is that UART input may contain zeros from PC-emulated terminals when the PC enters or leaves energy saving mode. Using searchTokenStart() in those situations would swallow the next command line entry.
The consequence is, of course, that the parameter len must be correct. The idiom of passing an oversized len for a 0-terminated string fails with this function.
s | the RAM string to search in |
si | the index to start search (usually 0 for first token) |
len | the maximum length of s to search in |
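The effect of skipping 0-codes can be shown with a small host-side sketch. The name sketchFirstToken is hypothetical, and treating every code up to and including space as a non-token character is an assumption derived from the description above.

```c
#include <stdint.h>

/* Hypothetical sketch: find the first token start in s[si .. len-1].
 * 0-codes are skipped like any other control code, so len must be exact.
 * Returns the token's start index, or 255 if there is none. */
static uint8_t sketchFirstToken(char s[], uint8_t si, uint8_t len)
{
    for (; si < len; ++si) {
        uint8_t c = (uint8_t)s[si];
        if (c > ' ' && c != ',' && c != ';' && c != '=') return si;
    }
    return 255;
}
```

A buffer that starts with stray zeros followed by " help" still yields the index of the 'h', whereas a search that stops at the first 0-code would report no token.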
uint8_t searchTokenStart(char s[], uint8_t si, uint8_t len)
Search the next token in a short (RAM) string.
This function returns the start index (>=si) of the token found in string s. A token may begin with any non-control character except space, comma, semicolon and equals ( ,;=).
255 is returned if no token was found before [len-1] or a 0-code. This 0-code condition is the only difference to searchFirstToken().
s | the RAM string to search in |
si | the index to start search |
len | the maximum length of s to search in |
uint8_t searchTokenEnd(char s[], uint8_t si, uint8_t len)
Search the end of a token in a short (RAM) string.
This function returns the index of the first character following the token starting at [si] in string s. It returns len if no end was found before. The return value is <= len.
Any control character, space, comma, semicolon and equals ( ,;=) is considered as the end of a token.
This function's return value (if < len) is a good start for the next token search by searchTokenStart(s, retval, len).
s | the RAM string to search in |
si | the (found) token's begin index |
len | the maximum length of s to search in |
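The interplay of the start and end search can be sketched as a simple tokenizing loop. All names here (sketchIsDelim, sketchTokenStart, sketchTokenEnd, sketchCountTokens) are hypothetical; the sketch assumes that indices below len are inspected and that every code up to and including space counts as a delimiter.

```c
#include <stdint.h>

/* Control codes, space, comma, semicolon and equals end a token. */
static uint8_t sketchIsDelim(char c)
{
    uint8_t u = (uint8_t)c;
    return u <= ' ' || u == ',' || u == ';' || u == '=';
}

/* Next token start at or after si; 255 if none before len or a 0-code. */
static uint8_t sketchTokenStart(char s[], uint8_t si, uint8_t len)
{
    for (; si < len && s[si] != 0; ++si) {
        if (!sketchIsDelim(s[si])) return si;
    }
    return 255;
}

/* Index of the first character after the token at si; len if none. */
static uint8_t sketchTokenEnd(char s[], uint8_t si, uint8_t len)
{
    while (si < len && !sketchIsDelim(s[si])) ++si;
    return si;
}

/* Walk over all tokens by chaining end -> next start. */
static uint8_t sketchCountTokens(char s[], uint8_t len)
{
    uint8_t count = 0;
    uint8_t start = sketchTokenStart(s, 0, len);
    while (start != 255) {
        uint8_t end = sketchTokenEnd(s, start, len);
        ++count;
        if (end >= len) break;            /* token ran to the end */
        start = sketchTokenStart(s, end, len);
    }
    return count;
}
```

For the line "set led=on; x" this loop visits the four tokens set, led, on and x.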
uint8_t searchTokenIn(char s[], uint8_t si, uint8_t se, char const *const *list, char const *const *list2)
Match a (short) token in a (RAM) string to a list of flash strings.
This function returns the index of the first entry matching the token in the flash array list of flash strings. The array list must end with a NULL, as in the example
char* systemCommands[] = {comHelp, comWDlong, NULL};
The other entries, comHelp and comWDlong in the example, are 0-terminated strings in flash memory (PROGMEM).
The same applies to the second list list2. No list shall be longer than INDEX_OFFSET_LIST2.
The return value is the first match's index (range 0 .. list length - 1) or 255 if no match could be found or is possible.
The token is matched against the beginnings of the flash strings, so a match also occurs if the flash string is longer than the token. To recognise an ambiguous abbreviation, a second search beyond the first hit is then made. 254 is returned if a second, ambiguous hit is found.
The comparison is case insensitive. The token must not contain any control or white space characters.
s | the RAM string containing the token to search for |
si | the token's start index in s |
se | the token's end index + 1 in s |
list | the flash array of flash strings to match to (ending with NULL) |
list2 | the second flash array of flash strings to match to (ending with NULL); INDEX_OFFSET_LIST2 is added to the returned match index to distinguish the result from a match in list |
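The matching rules can be sketched on a host machine with ordinary RAM strings instead of PROGMEM. Everything named sketch... or SKETCH_... is hypothetical. Two points are assumptions: the ambiguity check spans both lists, and an exact match is not preferred over a longer entry with the same beginning.

```c
#include <stdint.h>
#include <stddef.h>
#include <ctype.h>

#define SKETCH_OFFSET_LIST2 125   /* mirrors INDEX_OFFSET_LIST2 */

/* Does the token s[si .. se-1] match the beginning of cand, ignoring case? */
static int sketchIsPrefix(const char s[], uint8_t si, uint8_t se,
                          const char *cand)
{
    for (; si < se; ++si, ++cand) {
        if (*cand == '\0') return 0;      /* token longer than candidate */
        if (tolower((unsigned char)s[si]) != tolower((unsigned char)*cand))
            return 0;
    }
    return 1;
}

/* Returns the match index, 254 if ambiguous, 255 if there is no match. */
static uint8_t sketchSearchTokenIn(const char s[], uint8_t si, uint8_t se,
                                   const char *const *list,
                                   const char *const *list2)
{
    const char *const *lists[2] = { list, list2 };
    uint8_t hit = 255;
    if (si >= se) return 255;             /* empty token: no match possible */
    for (uint8_t l = 0; l < 2; ++l) {
        if (lists[l] == NULL) continue;
        for (uint8_t i = 0; lists[l][i] != NULL; ++i) {
            if (!sketchIsPrefix(s, si, se, lists[l][i])) continue;
            if (hit != 255) return 254;   /* second hit: ambiguous */
            hit = (uint8_t)(i + (l ? SKETCH_OFFSET_LIST2 : 0));
        }
    }
    return hit;
}
```

On an AVR target the candidate strings would be read from flash, for example with the avr-libc program space helpers, rather than dereferenced directly.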
char* getFirstSVNtokenP(char *dest, char const *src, uint8_t mxLen)
Copy the first SVN token or a full string from program space.
This function copies the first SVN token (i.e. the Subversion keyword's first value) from src to dest plus a trailing 0. Returned is the address of the last character modified in dest (i.e. the 0). In the example
"$Dáte: 2012-12-19 14:13:57 +0100 (Mi, 19 Dez 2012) $"
the first SVN token would be "2012-12-19".
The operation will stop after mxLen characters have been transferred or modified. The mxLen count includes the 0 appended.
If src does not begin with a $, src will be transferred from its start to the first $ or white space (or its end).
dest | the destination to modify (in RAM) |
src | the source (the SVN tag) to copy from (in flash memory) |
mxLen | the maximum number of characters to modify in dest including the trailing 0 appended. This also bounds how far the return value can advance beyond the parameter dest |
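The copy rule can be sketched for a source string in ordinary RAM. The name sketchFirstSvnToken is hypothetical, and skipping the keyword name up to the colon plus following blanks is an assumption about how the first value is located; the real function reads src from flash.

```c
#include <stdint.h>

/* Hypothetical sketch: copy the first value of an SVN keyword string, or
 * the leading part of a plain string, into dest and append a 0.
 * At most mxLen characters of dest are modified, the 0 included.
 * Returns the address of the terminating 0 in dest. */
static char *sketchFirstSvnToken(char *dest, const char *src, uint8_t mxLen)
{
    if (mxLen == 0) return dest;                  /* nothing may be modified */
    if (*src == '$') {
        while (*src != '\0' && *src != ':') ++src;    /* skip keyword name */
        if (*src == ':') ++src;
        while (*src == ' ') ++src;                /* blanks before the value */
    }
    while (mxLen > 1 && *src != '\0' && *src != '$'
           && *src != ' ' && *src != '\t') {
        *dest++ = *src++;
        --mxLen;                                  /* keep room for the 0 */
    }
    *dest = '\0';
    return dest;
}
```

Because the returned pointer addresses the terminating 0, a caller can append further text directly at that position.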
uint8_t asDigit(uint8_t c)
Recognise one digit.
This helper function returns the digit value 0..9 for characters '0'..'9' and 10..15 for characters 'a'..'f' or 'A'..'F'. If the parameter c is in neither range 16 is returned. Hence result & 16 means invalid.
c | the character |
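A function with the behaviour described above could look like the following sketch. The name sketchAsDigit is hypothetical and the sketch assumes ASCII input; the actual asDigit may be implemented differently.

```c
#include <stdint.h>

/* Hypothetical sketch: digit value 0..15 for '0'..'9', 'a'..'f', 'A'..'F';
 * 16 for any other character, so (result & 16) flags an invalid digit. */
static inline uint8_t sketchAsDigit(uint8_t c)
{
    if (c >= '0' && c <= '9') return (uint8_t)(c - '0');
    c |= 0x20;                            /* fold ASCII upper to lower case */
    if (c >= 'a' && c <= 'f') return (uint8_t)(c - 'a' + 10);
    return 16;
}
```

Returning 16 for invalid input lets a caller test validity with a single bit mask instead of a range comparison.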
uint8_t parse2hex(uint8_t *res, char s[], uint8_t si)
Parse a two digit hexadecimal number.
This function tries to interpret a part of the string s starting at si as a one or two digit hexadecimal number. In case of success the index of the first character after that number is returned, and 0 otherwise.
Leading blanks will be skipped.
res | pointer to the result; changed only if a hexadecimal number was parsed successfully |
s | the RAM string containing the 2 digit hex (00..FF/ff) |
si | the first digit's index in s |
uint8_t parseByteNum(uint8_t *res, char s[], uint8_t si)
Parse a one byte number.
This function tries to interpret a part of the string s starting at si as an eight bit (one byte) number. In case of success the index of the first character after the parsed number is returned, and 0 otherwise.
Leading blanks will be skipped. One sign character '+' or '-' is accepted. A minus will complement the result, notwithstanding it being declared unsigned or possibly parsed as hex. Leading zeroes do not, of course, change the result, and they do not lead to octal interpretation.
After a leading 0x or 0X the following one or two characters are interpreted as hexadecimal digits, not counting leading zeros after the 'x'. Parsing then stops at the latest after the second digit, even if the following character (the one the returned index points to) is a legal hex digit.
res | pointer to the result; changed only if a one byte number (0..255 or 0x0 .. 0xFF) was parsed successfully |
s | the RAM string containing the digit(s) |
si | the first digit's index in s |
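The rules for sign, leading zeros and the 0x prefix can be sketched as below. The names sketchByteHex and sketchParseByteNum are hypothetical. Two assumptions are made: "complement" is read as two's complement negation, and decimal values above 255 are treated as a failure.

```c
#include <stdint.h>

/* Value of one hex digit, 16 if c is not a hex digit. */
static uint8_t sketchByteHex(char c)
{
    if (c >= '0' && c <= '9') return (uint8_t)(c - '0');
    if (c >= 'a' && c <= 'f') return (uint8_t)(c - 'a' + 10);
    if (c >= 'A' && c <= 'F') return (uint8_t)(c - 'A' + 10);
    return 16;
}

/* Returns the index after the number, or 0 on failure (res untouched). */
static uint8_t sketchParseByteNum(uint8_t *res, char s[], uint8_t si)
{
    uint8_t neg = 0;
    uint8_t digits = 0;
    uint16_t val = 0;

    while (s[si] == ' ') ++si;                    /* skip leading blanks */
    if (s[si] == '+' || s[si] == '-') {           /* one optional sign */
        neg = (s[si] == '-');
        ++si;
    }
    if (s[si] == '0' && (s[si + 1] == 'x' || s[si + 1] == 'X')) {
        uint8_t zeros = 0;
        si = (uint8_t)(si + 2);
        while (s[si] == '0') { ++si; zeros = 1; } /* zeros are not counted */
        while (digits < 2 && sketchByteHex(s[si]) < 16) {
            val = (uint16_t)((val << 4) | sketchByteHex(s[si]));
            ++si;
            ++digits;
        }
        if (digits == 0 && !zeros) return 0;      /* "0x" without digits */
    } else {
        while (s[si] >= '0' && s[si] <= '9') {    /* decimal, never octal */
            val = (uint16_t)(val * 10 + (s[si] - '0'));
            if (val > 255) return 0;              /* assumed range check */
            ++si;
            ++digits;
        }
        if (digits == 0) return 0;
    }
    *res = neg ? (uint8_t)(0u - val) : (uint8_t)val;
    return si;
}
```

For the input "0x0001F3" this sketch yields 0x1F and returns the index of the trailing '3', matching the rule that parsing stops after the second significant hex digit.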
uint8_t parseWordNum(uint16_t *res, char s[], uint8_t si)
Parse a two byte number.
This function tries to interpret a part of the string s starting at si as a 16 bit (type uint16_t) number. In case of success the index of the first character after the parsed number is returned, and 0 otherwise.
Leading blanks will be skipped. One sign character '+' or '-' is accepted. A minus will complement the result, notwithstanding it being declared unsigned or possibly parsed as hex. Leading zeroes do not, of course, change the result, and they do not lead to octal interpretation.
After a leading 0x or 0X the following one to four characters are interpreted as hexadecimal digits, not counting leading zeros after the 'x'. Parsing then stops at the latest after the fourth digit, even if the following character (the one the returned index points to) is a legal hex digit.
res | pointer to the result; changed only if a number was parsed successfully |
s | the RAM string containing the digit(s) |
si | the first digit's index in s |
uint8_t parseDWordNum(uint32_t *res, char s[], uint8_t si)
Parse a four byte number.
This function tries to interpret a part of the string s starting at si as a 32 bit (type uint32_t) number. In case of success the index of the first character after the parsed number is returned, and 0 otherwise.
Leading blanks will be skipped. One sign character '+' or '-' is accepted. A minus will complement the result, notwithstanding it being declared unsigned or possibly parsed as hex. Leading zeroes do not, of course, change the result, and they do not lead to octal interpretation.
After a leading 0x or 0X the following one to eight characters are interpreted as hexadecimal digits, not counting leading zeros after the 'x'. Parsing then stops at the latest after the eighth digit, even if the following character (the one the returned index points to) is a legal hex digit.
res | pointer to the result; changed only if a number was parsed successfully |
s | the RAM string containing the digit(s) |
si | the first digit's index in s |
uint8_t isContainedIn(char s[], const uint8_t c)
Check if a character is contained in a RAM string.
This function returns true if the character c is not 0 and contained in the 0-terminated string s.
s | the (0-terminated) RAM string to be searched for c |
c | the (non 0) character looked for |
Returns: 0: c was not found in s; c: c was found in s
uint8_t isContainedIn_P(char s[], const uint8_t c)
Check if a character is contained in a flash string.
This function returns true if the character c is not 0 and contained in the 0-terminated string s. s must be in flash memory.
s | the (0-terminated) string in flash / program memory to be searched for c |
c | the (non 0) character looked for |
Returns: 0: c was not found in s; c: c was found in s