SimpleTalk Grammar
Here is a formal grammar for SimpleTalk (so far) in EBNF.
- Braces {...} represent the Kleene star: zero-or-more repetition.
- Brackets [...] represent an optional (zero-or-one) construct.
- Vertical bars ...|... represent mutually exclusive alternatives.
(And here is a rationale for preferring EBNF over a CFG.)
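As an illustration of how this notation maps onto parser code, here is a minimal Python sketch, not SimpleTalk's actual implementation. It parses the header line of the Handler production below (on name [ name { , name } ]): the bracketed optional part becomes an if, and the braced repetition becomes a while loop. The function name parse_handler_header and the pre-split token list are assumptions made for this example.

```python
def parse_handler_header(tokens):
    """Parse `on name [ name { , name } ]` from a list of token strings.

    Returns (handler_name, parameter_names). Hypothetical helper.
    """
    def expect_name(pos):
        # A "name" is approximated here by a Python identifier (an assumption).
        if pos >= len(tokens) or not tokens[pos].isidentifier():
            raise SyntaxError(f"expected a name at token {pos}")
        return tokens[pos]

    if not tokens or tokens[0] != "on":
        raise SyntaxError("expected 'on'")
    handler_name = expect_name(1)
    pos = 2

    params = []
    # Brackets [ ... ]: an optional construct becomes an `if`.
    if pos < len(tokens):
        params.append(expect_name(pos))
        pos += 1
        # Braces { ... }: zero-or-more repetition becomes a `while`.
        while pos < len(tokens) and tokens[pos] == ",":
            params.append(expect_name(pos + 1))
            pos += 2

    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return handler_name, params
```

For example, parse_handler_header(["on", "add", "x", ",", "y"]) yields the handler name "add" with parameters ["x", "y"], while a header with no parameters yields an empty list.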
Commands
So far, all commands start with a unique keyword. This makes top-down parsing easy to implement. Eventually handler names will also serve as commands.
- Handler ➝
on name [ name { , name } ]
CommandSeq
end name
names
- CommandSeq ➝ { Command }
- Command ➝
AskCommand |
BreakCommand |
ExitCommand |
GetCommand |
IfCommand |
InPlaceOpCommand |
JoinCommand |
MoveCommand |
NextCommand |
PutCommand |
RepeatCommand |
SetCommand |
SplitCommand |
UndefineCommand
- AskCommand ➝ ask Expression [ with prompt Expression ]
- BreakCommand ➝ break
- ExitCommand ➝ exit repeat
- GetCommand ➝ get Expression
- IfCommand ➝
if Expression then OneLineCommand |
if Expression [ then ]
CommandSeq
{ else if Expression [ then ]
CommandSeq }
[ else
CommandSeq ]
end if
- InPlaceOpCommand ➝
add Expression to Changeable |
subtract Expression from Changeable |
multiply Changeable by Expression |
divide Changeable by Expression |
div Changeable by Expression |
mod Changeable by Expression
- JoinCommand ➝ join Expression [ with Expression ] into Changeable
- MoveCommand ➝ move Expression to name
- NextCommand ➝ next repeat
- PutCommand ➝ put Expression [ ( into | before | after ) Changeable ]
- RepeatCommand ➝
repeat ( forever | duration | count | bounds | enum )
CommandSeq
end repeat
- forever ➝ forever
- duration ➝ until Expression | while Expression
- count ➝ for Expression times
- bounds ➝ with name = Expression to Expression [ step Expression ]
- enum ➝ with name in Expression
- SetCommand ➝ set name of Expression to Expression
- SplitCommand ➝ split Expression [ by Expression ] into Changeable
- UndefineCommand ➝ undefine Expression
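Because every command above begins with its own keyword, a top-down parser can pick the production by looking at the first word alone. The following Python sketch shows that dispatch step under some assumptions: the table name COMMAND_KEYWORDS and the function classify_command are hypothetical, keywords are taken to be lowercase, and words are split on whitespace.

```python
# Leading keyword -> production name, taken from the Command list above.
COMMAND_KEYWORDS = {
    "ask": "AskCommand",
    "break": "BreakCommand",
    "exit": "ExitCommand",
    "get": "GetCommand",
    "if": "IfCommand",
    # All six in-place operators share one production.
    "add": "InPlaceOpCommand",
    "subtract": "InPlaceOpCommand",
    "multiply": "InPlaceOpCommand",
    "divide": "InPlaceOpCommand",
    "div": "InPlaceOpCommand",
    "mod": "InPlaceOpCommand",
    "join": "JoinCommand",
    "move": "MoveCommand",
    "next": "NextCommand",
    "put": "PutCommand",
    "repeat": "RepeatCommand",
    "set": "SetCommand",
    "split": "SplitCommand",
    "undefine": "UndefineCommand",
}


def classify_command(line):
    """Return the production name selected by the first word of `line`."""
    words = line.split()
    if not words:
        raise SyntaxError("empty command")
    try:
        return COMMAND_KEYWORDS[words[0]]
    except KeyError:
        raise SyntaxError(f"unknown command keyword {words[0]!r}") from None
```

A real parser would call a parsing function per production instead of returning its name; once handler names also serve as commands, the lookup failure would need to fall through to a handler lookup rather than raise.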
Remarks
Newlines are mandatory after each command, and within multi-line commands wherever the productions above show a line break. For example, the full production for the get command is:
GetCommand ➝ get Expression newline
The newline tokens are omitted from the productions above for brevity. Indentation is not required; we are not Python.
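To make the newline rule concrete, here is a small Python sketch, assuming a simplified tokenizer that splits on whitespace (so quoted strings containing spaces are not handled) and skips blank lines. It discards leading indentation, ends every line with an explicit newline token, and parses GetCommand ➝ get Expression newline. The names NEWLINE, tokenize, and parse_get are hypothetical, and the Expression is left as an unparsed run of tokens.

```python
NEWLINE = "\n"  # explicit end-of-line token


def tokenize(source):
    """Split source into word tokens, closing each non-blank line with NEWLINE."""
    tokens = []
    for line in source.splitlines():
        words = line.split()  # leading indentation is discarded here
        if words:
            tokens.extend(words)
            tokens.append(NEWLINE)
    return tokens


def parse_get(tokens, pos=0):
    """GetCommand -> get Expression newline.

    Returns (expression_tokens, position_after_newline).
    """
    if pos >= len(tokens) or tokens[pos] != "get":
        raise SyntaxError("expected 'get'")
    pos += 1
    start = pos
    # Stand-in for a real Expression parser: take everything up to the newline.
    while pos < len(tokens) and tokens[pos] != NEWLINE:
        pos += 1
    if pos == start:
        raise SyntaxError("expected an expression after 'get'")
    if pos >= len(tokens):
        raise SyntaxError("expected a newline after the get command")
    return tokens[start:pos], pos + 1
```

Since the returned position points just past the newline, the caller can parse the next command of a CommandSeq from there.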
Expressions
The expression hierarchy encodes the precedence, or binding order, of operations: the lower a production sits in the list, the more tightly its operators bind. Each production refers to the next level down, so there is (generally) no direct recursion. For example, the rule for E1 does not have E1 on its right-hand side. (Exceptions include E10.) The hierarchy recurs indirectly in the last production, where an E12 can be a parenthesized Expression.
- Expression ➝ E1 { ( & | && ) E1 }
- E1 ➝ E2 { ~ E2 }
- E2 ➝ E3 { or E3 }
- E3 ➝ E4 { and E4 }
- E4 ➝ E5 [ ( = | <> | ≠ ) E5 ]
- E5 ➝ E6 [ ( is | isnt ) ValueProperty ]
- E6 ➝ E7 [ ( < | > | <= | ≤ | >= | ≥ ) E7 ]
- E7 ➝ E8 { ( + | − ) E8 }
- E8 ➝ E9 { ( * | × | / | ÷ | mod | div ) E9 }
- E9 ➝ [ − | not ] E10a
- E10a ➝ TextSelector Expression of E10a | E10b
- E10b ➝ IndexSelector Expression of E10b | E11
- E11 ➝ the name [ of E11 ] | E12
- E12 ➝ number | string | boolean |
List | Dict | name | ( Expression )
- List ➝ “[” [ Expression { , Expression } ] “]”
- Dict ➝ “{” [ KeyValue { , KeyValue } ] “}”
- KeyValue ➝ Expression : Expression
- TextSelector ➝ char | word | line
- IndexSelector ➝ item | entry
- ValueProperty ➝ Adjective | a TypeName
- Adjective ➝ “empty” | “nonempty” | “defined” | “undefined” |
“negative” | “positive” | “finite” | “infinite”
- TypeName ➝ “Bool” | “Dict” | “List” | “Number” | “String”
- Changeable ➝ ( TextSelector | IndexSelector ) Expression of Changeable | name
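The way one function per level yields operator precedence can be sketched in Python for a small arithmetic subset of the hierarchy: the E7, E8, and E9 levels plus an E12 limited to numbers and parentheses. This is an illustrative sketch, not SimpleTalk's parser. The selector and property levels are skipped, a parenthesized group re-enters at E7 rather than at the full Expression, both the ASCII hyphen and the typographic minus are accepted, and the tuple-based tree format is an assumption.

```python
import re

TOKEN_RE = re.compile(r"\s*(\d+(?:\.\d+)?|mod|div|not|[-+*/()×÷−])")


def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def e7(self):
        # Additive level: { ... } becomes a loop, giving left associativity.
        node = self.e8()
        while self.peek() in ("+", "-", "−"):
            op = self.advance()
            node = (op, node, self.e8())
        return node

    def e8(self):
        # Multiplicative level: binds tighter because e7 calls down into it.
        node = self.e9()
        while self.peek() in ("*", "×", "/", "÷", "mod", "div"):
            op = self.advance()
            node = (op, node, self.e9())
        return node

    def e9(self):
        # Optional prefix operator; the selector levels are skipped here.
        if self.peek() in ("-", "−", "not"):
            op = self.advance()
            return (op, self.e12())
        return self.e12()

    def e12(self):
        tok = self.advance()
        if tok == "(":
            node = self.e7()  # indirect recursion back to the top of the subset
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if tok is not None and tok[0].isdigit():
            return float(tok)
        raise SyntaxError(f"unexpected token {tok!r}")


def parse(text):
    parser = Parser(tokenize(text))
    tree = parser.e7()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree
```

Parsing "1 + 2 * 3" nests the multiplication inside the addition, because the additive function only ever sees complete multiplicative operands; parentheses override this by restarting at the top level.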
Remarks
In some cases we want words to serve a dual purpose, acting as keywords in some productions and as variable names in others. (Aka contextual keywords.)
For example, the Parser has some (inelegant) special-casing to allow the word ‘a’ to be used both as an indefinite article (as in ValueProperty ➝ a number) and as a variable name (as in E12). It would be simpler to just declare a to be a reserved word, but scripters will (reasonably) expect to be able to use it for a variable.
Similarly, it would be reasonable to write
repeat with line in the lines of text
end repeat
get line 2 of text
put line 2 of text into line
and expect it to just work. (Although the third example is perhaps poor style.) The same situation arises with any of the Selector words “item”, “entry”, “line”, “word”, and “char”: we want to use them as keywords in E10, and as variable names in RepeatCommand blocks.
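One way such contextual keywords could be told apart is by lookahead. The Python sketch below is a heuristic invented for illustration, not the special-casing the SimpleTalk Parser actually uses: a selector word counts as a keyword only when the next token could begin an Expression and an "of" appears later on the same line, as the E10 productions require. The names SELECTOR_WORDS, NON_STARTERS, and classify are hypothetical, and the NON_STARTERS list is partial.

```python
SELECTOR_WORDS = {"char", "word", "line", "item", "entry"}

# Tokens assumed unable to begin an Expression (a partial, illustrative list).
NON_STARTERS = {
    "in", "of", "to", "into", "before", "after", "by", "with", "then",
    "=", "+", "*", "/", ")", ",",
}


def classify(tokens, i):
    """Return 'selector' or 'name' for the selector word at tokens[i].

    `tokens` holds the words of a single line.
    """
    if tokens[i] not in SELECTOR_WORDS:
        raise ValueError(f"{tokens[i]!r} is not a selector word")
    rest = tokens[i + 1:]
    # Nothing follows, or what follows cannot start an Expression:
    # the word must be a variable name.
    if not rest or rest[0] in NON_STARTERS:
        return "name"
    # `Selector Expression of ...` needs an `of` after the index expression.
    return "selector" if "of" in rest[1:] else "name"
```

Applied to the three lines above, this treats line as a name in the repeat header, as a selector in the get command, and as a selector followed by a name in the put command.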