Commit cf8e3809 authored by Hugo Hörnquist

Major rewrites in Notation.hs

parent 33419135
...@@ -30,126 +30,73 @@ letter. ...@@ -30,126 +30,73 @@ letter.
both LysType and TypeField both LysType and TypeField
\end{itemize} \end{itemize}
} }
The \emph{only} possible (non-derived) types are
A LysType our notation for our different ``core'' types, \begin{code}
which all in some way hold a TypeField. data LysType = INT32 | INT16 | INT8 | BOOL | FLOAT | HOLLERITH
\begin{code} | BITSTRING [String]
data LysType = SimpleType TypeField | ENUMERATION [(String, Int)]
| BitstringType [TypeField] | ENUMERATION_OF String -- must be name of a selection
| EnumType [TypeField] | STRUCTURE [(String, LysType)]
| StructureType [TypeField] | SELECTION [(Int, String, (String, LysType))]
| SelectionType [TypeField] | ARRAY LysType
| ArrayType TypeField
deriving (Show) deriving (Show)
\end{code} \end{code}
A type field holds all the data we have about a type, This means that the parser for type signatures is simply:
including nested types. And is represented as:
\begin{code}
data TypeField = OnlyType String
| BitStringField String
| StructureField TypeBinding
| EnumField Int String
| SelectionField Int String TypeBinding
| ArrayField TypeField
deriving (Show)
\end{code}
\subsection{Parsing Types}
Types come in a few forms. But they all share similarities.
First out are the simple primitive types. These are INT32,
INT16, INT8, BOOL, FLOAT \& HOLLERITH. They all represent
a number of different (obvious) types, except for HOLLERITH
which is a string (see section \ref{item:holler}).
They can all be parsed with:
\begin{code} \begin{code}
typeWordParser :: GenParser Char () TypeField lystypeParser :: GenParser Char () LysType
typeWordParser = OnlyType <$> word lystypeParser =
(try (string "INT8") >> return INT8)
<|> (try (string "INT16") >> return INT16)
<|> (try (string "INT32") >> return INT32)
<|> (try (string "BOOL") >> return BOOL)
<|> (try (string "FLOAT") >> return FLOAT)
<|> (try (string "HOLLERITH") >> return HOLLERITH) -- see section \ref{item:holler}.
<|> (try arrayTypeParser >>= return . ARRAY)
<|> (try bitstringTypeParser >>= return . BITSTRING)
<|> (try selectionParser >>= return . SELECTION)
<|> (try enumTypeParser >>= return . ENUMERATION)
<|> (try enumOfTypeParser >>= return . ENUMERATION_OF)
<|> (try structTypeParser >>= return . STRUCTURE)
\end{code} \end{code}
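The primitive and array cases of the new parser can be tried in isolation. The following is a minimal sketch, not the project's code: it uses a cut-down copy of \texttt{LysType}, the hypothetical names \texttt{primitive} and \texttt{lysType}, and Parsec's own \texttt{spaces} in place of the project's \texttt{whitespaces} helper, whose definition is not shown in this section.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Cut-down copy of the new LysType: primitives and arrays only.
data LysType = INT32 | INT16 | INT8 | BOOL | FLOAT | HOLLERITH
             | ARRAY LysType
  deriving (Show, Eq)

-- Every keyword is wrapped in `try`, so a partial match (for example
-- the "INT" prefix of "INT16" while testing "INT8") is rolled back
-- before the next alternative is attempted.
primitive :: Parser LysType
primitive = choice
  [ INT8      <$ try (string "INT8")
  , INT16     <$ try (string "INT16")
  , INT32     <$ try (string "INT32")
  , BOOL      <$ try (string "BOOL")
  , FLOAT     <$ try (string "FLOAT")
  , HOLLERITH <$ try (string "HOLLERITH")
  ]

-- ARRAY recurses into the full type parser, so nested arrays work too.
-- `spaces` is Parsec's built-in; the project's `whitespaces` is assumed
-- to behave similarly.
lysType :: Parser LysType
lysType = (ARRAY <$> (try (string "ARRAY") *> spaces *> lysType))
      <|> primitive

main :: IO ()
main = do
  print (parse lysType "" "ARRAY INT32")
  print (parse lysType "" "HOLLERITH")
```

Because the constructors now carry their payload directly, no intermediate \texttt{TypeField} wrapper is needed to tell a single type from a list of fields.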
\hr
\subsubsection{Arrays} \subsubsection{Arrays}
Arrays are a simple type which represents a list of one Arbitrary length lists of a single type. Specified by
type. In the specification they are written as
\begin{verbatim} \begin{verbatim}
ARRAY <type> ARRAY <type>
\end{verbatim} \end{verbatim}
Where <type> is which types they hold.
\begin{code} and parsed as
arrayParser :: GenParser Char () TypeField
arrayParser = ArrayField
<$> (string "ARRAY"
*> whitespaces
*> typeWordParser)
\end{code}
\hr
We then have the structure types BITSTRINGS, ENUMERATIONS,
SELECTIONS, \& STRUCTURES.
They all have in common that they handle a list of
declarations, surrounded by parentheses. Therefore we start
by creating a general listParser, which takes a parser for each
field and returns a list of fields.
\begin{code} \begin{code}
listParser :: GenParser Char () TypeField -> GenParser Char () [TypeField] arrayTypeParser :: GenParser Char () LysType
listParser fieldParser arrayTypeParser = string "ARRAY" *> whitespaces *> lystypeParser
= withDelim' "()" (many $ withWS fieldParser)
\end{code} \end{code}
Many of them also have a name before their list. This
parser takes a parser for the actual list, and checks if
a set string appears before it.
\begin{code} \begin{code}
specialTypeParser listParser fieldParser = withDelim' "()" (many $ withWS fieldParser)
:: String
-> GenParser Char () [TypeField]
-> GenParser Char () [TypeField]
specialTypeParser str subParser =
string str *> whitespaces *> subParser
\end{code} \end{code}
\subsubsection{Bitstring} \subsubsection{Bitstring}
Bitstrings are the simplest type. They represent a number A list of booleans. Specified as
of bits. A sample bitstring structure in the BNF could look
like:
\begin{verbatim} \begin{verbatim}
BITSTRING ( name; BITSTRING ( name; ... )
other-name;
)
\end{verbatim} \end{verbatim}
where each name is a descriptive string of that bit.
Where both `name' and `other-name' declare one bit field
each, and each tell what that field should contain. It also
implies that this specific bitstring holds exactly two bits,
since it has two fields.
\begin{code} \begin{code}
bitstringFieldParser :: GenParser Char () TypeField bitstringTypeParser :: GenParser Char () [String]
bitstringFieldParser = BitStringField <$> word bitstringTypeParser = string "BITSTRING"
<* whitespaces *> whitespaces
<* char ';' *> listParser (word <* whitespaces <* char ';')
bitstringParser = specialTypeParser "BITSTRING" (listParser bitstringFieldParser)
\end{code} \end{code}
\subsubsection{Selections} \subsubsection{Selections}
\TODO{understand Selections} Tagged unions.
If I understand correctly selections are a form of type unions.
\footnote{Please correct me}
They declare a <name> <n> mapping which is used for
specifying which type <type> will be used. I don't know why
the <tail> field exists, but it creates a second name.
\begin{verbatim} \begin{verbatim}
selection ( selection (
...@@ -157,22 +104,39 @@ selection ( ...@@ -157,22 +104,39 @@ selection (
) )
\end{verbatim} \end{verbatim}
\begin{quote}
Protocol A: Simple Data Types
given
\begin{verbatim}
description ::= SELECTION (
1=name the_name : HOLLERITH;
2=age years : INT32;
)
\end{verbatim}
[the] two legal messages of the type `description' are `1 4HJohn' and `2 18'.
\end{quote}
<name> and <tail> are names meant only for the reader of the protocol.
\begin{code} \begin{code}
selectionFieldParser :: GenParser Char () TypeField selectionFieldParser :: GenParser Char () (Int, String, (String, LysType))
selectionFieldParser selectionFieldParser
= SelectionField = (,,)
<$> (intParser <* char '=') <$> (intParser <* char '=') <*> (withWS word)
<*> (withWS word) <*> structFieldParser <* whitespaces <* char ';'
<*> (bindingParser <* whitespaces <* char ';')
selectionParser = specialTypeParser "SELECTION" (listParser selectionFieldParser) selectionParser = string "SELECTION" *> listParser selectionFieldParser
\end{code} \end{code}
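To make the quoted Protocol A example concrete, here is a minimal sketch that parses that \texttt{SELECTION} declaration into the \texttt{(Int, String, (String, LysType))} triples used by the new representation. The helpers \texttt{word} and \texttt{intParser} are assumed stand-ins (the project's own definitions are not shown here), \texttt{Ty} is a hypothetical two-constructor substitute for \texttt{LysType}, and Parsec's \texttt{spaces} replaces \texttt{whitespaces} and \texttt{withWS}.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical stand-in for LysType, just large enough for the example.
data Ty = INT32 | HOLLERITH
  deriving (Show, Eq)

-- Assumed helper: an identifier made of letters, digits, '-' and '_'.
word :: Parser String
word = many1 (alphaNum <|> oneOf "-_")

-- Assumed helper: a non-negative decimal integer.
intParser :: Parser Int
intParser = read <$> many1 digit

ty :: Parser Ty
ty = (INT32 <$ try (string "INT32"))
 <|> (HOLLERITH <$ try (string "HOLLERITH"))

-- One field: "<n>=<name> <tail> : <type>;"
-- The terminating ';' is consumed exactly once, here.
selectionField :: Parser (Int, String, (String, Ty))
selectionField = (,,)
  <$> (intParser <* char '=')
  <*> (word <* spaces)
  <*> ((,) <$> (word <* spaces <* char ':' <* spaces) <*> ty)
  <* spaces <* char ';'

selection :: Parser [(Int, String, (String, Ty))]
selection = string "SELECTION" *> spaces
         *> between (char '(' *> spaces) (char ')')
                    (many (selectionField <* spaces))

main :: IO ()
main = print (parse selection ""
  "SELECTION (\n  1=name the_name : HOLLERITH;\n  2=age years : INT32;\n)")
```

The integer tag is the only part that appears on the wire; \texttt{<name>} and \texttt{<tail>} survive in the parsed result purely as documentation.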
\subsubsection{Enumerations} \subsubsection{Enumerations}
An enumeration works just as expected. It declares a number Named subset of integer, equivalent to C. Can either be declared
of symbols, as well as integer representations for them all. directly through the ENUMERATION statement\footnote{
there actually aren't any ENUMERATION statements in protocol A
}, or derived from a
selection through ENUMERATION-OF.
An example BNF of it would be: The ``regular'' case looks like
\begin{verbatim} \begin{verbatim}
ENUMERATION ( name = 1; ENUMERATION ( name = 1;
other = 2; other = 2;
...@@ -180,96 +144,49 @@ ENUMERATION ( name = 1; ...@@ -180,96 +144,49 @@ ENUMERATION ( name = 1;
\end{verbatim} \end{verbatim}
\begin{code} \begin{code}
enumFieldParser :: GenParser Char () TypeField enumFieldParser :: GenParser Char () (String, Int)
enumFieldParser = flip EnumField enumFieldParser = do
<$> word w <- word
<*> withDelim' "=;" intParser string "="
i <- intParser
whitespaces
string ";"
return (w, i)
-- (,) <$> word <* char '=' *> intParser <* whitespaces <* char ';'
enumTypeParser = listParser enumFieldParser
\end{code} \end{code}
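The commented-out applicative one-liner in the block above would not typecheck as written: \texttt{<\$>}, \texttt{<*} and \texttt{*>} all share the same left-associative precedence, so the \texttt{*>} discards the partially applied pair and leaves only the integer. A minimal sketch of an applicative version with explicit grouping follows. The helpers \texttt{word} and \texttt{intParser} are assumed stand-ins, the names \texttt{enumField} and \texttt{enumFields} are hypothetical, and whitespace around \texttt{=} is permitted here as an assumption so that the \texttt{name = 1;} example parses.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Assumed helper: an identifier made of letters, digits, '-' and '_'.
word :: Parser String
word = many1 (alphaNum <|> oneOf "-_")

-- Assumed helper: a non-negative decimal integer.
intParser :: Parser Int
intParser = read <$> many1 digit

-- "name = 1;" becomes ("name", 1). The parentheses matter: each
-- argument of (,) is grouped with the separators it should swallow.
enumField :: Parser (String, Int)
enumField = (,)
  <$> (word <* spaces <* char '=' <* spaces)
  <*> (intParser <* spaces <* char ';')

-- A parenthesised list of fields, in the spirit of listParser.
enumFields :: Parser [(String, Int)]
enumFields = between (char '(' *> spaces) (char ')')
                     (many (enumField <* spaces))

main :: IO ()
main = print (parse enumFields "" "( name = 1; other = 2; )")
```

This behaves like the do-notation version for well-formed input; which style reads better is a matter of taste.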
They can also appear on the form The second form is
\begin{verbatim} \begin{verbatim}
ENUMERATION-OF (<selection-type>) ENUMERATION-OF (<selection-type>)
\end{verbatim} \end{verbatim}
Which builds an enumeration from the <n> and <name> field in Which builds an enumeration from the <n> and <name> field in
a selection. See above. a selection.
The parser for the BNF here would be
\footnote{The parser might work. But I can't figure out the types for it.}
\begin{code}
-- enumSelectionParser :: GenParser Char () TypeField
-- enumSelectionParser = SelectEnumField
-- <$> word "ENUMERATION-OF"
-- *> withDelim "()" ( typeWordParser
-- <|> selectionParser
-- <?> "Selection")
\end{code}
\begin{code}
enumParser = specialTypeParser "ENUMERATION" (listParser enumFieldParser)
-- <|> enumSelectionParser
-- <?> "BNF Enum declaration"
\end{code}
\TODO{move the following somewhere else.}
On evaluating the inner selection is expanded, and bound
translated to an enum with.
\begin{code} \begin{code}
-- makeEnum :: LysType -> LysType enumOfTypeParser = string "ENUMERATION-OF" *> withDelim "()" word
-- makeEnum (SelectionType []) = []
-- makeEnum (SelectionType (s:xs))
-- = EnumField n name : makeEnum xs
-- where (SelectionField n name _ _) = s
\end{code} \end{code}
\subsubsection{Structures} \subsubsection{Structures}
A structure is just a simple compound data type, on the form A structure is just a simple compound data type, on the form
\begin{verbatim} \begin{verbatim}
( field-name : TYPE; ( field-name : TYPE; ... )
other-name : TYPE;
)
\end{verbatim} \end{verbatim}
Note here that with my implementation the last semicolon is % Note here that with my implementation the last semicolon is
optional. This is to better work with how the BNF specifies % optional. This is to better work with how the BNF specifies
RPC requests (see section \ref{item:rpc}) % RPC requests (see section \ref{item:rpc})
\begin{code}
structFieldParser :: GenParser Char () TypeField
structFieldParser = StructureField
<$> bindingParser
<* whitespaces
<* ( char ';'
<|> lookAhead (char ')')
<?> "Struct Field End")
\end{code}
The reason for the \lstinline{lookAhead (char ')')} in the
above code is that we want to check for the structure
ending without consuming it, since the actual parsing of
the surrounding parentheses is done in the ``listParser''.
\begin{code} \begin{code}
structParser = listParser structFieldParser structFieldParser :: GenParser Char () (String, LysType)
\end{code} structFieldParser = (,)
<$> (word <* withWS (string ":"))
<*> lystypeParser <* whitespaces <* char ';'
\hr structTypeParser :: GenParser Char () [(String, LysType)]
structTypeParser = listParser structFieldParser
We can now create a general type parser, which just binds
all our above parsers into one. It also shows that the
reason for LysType existing besides TypeField was so that we
didn't have to worry about a type being single or multiple.
\begin{code}
typeParser :: GenParser Char () LysType
typeParser = (BitstringType <$> try bitstringParser)
<|> ( EnumType <$> try enumParser)
<|> (SelectionType <$> try selectionParser)
<|> ( ArrayType <$> try arrayParser)
<|> (StructureType <$> try structParser)
<|> ( SimpleType <$> typeWordParser)
<?> "LysType"
\end{code} \end{code}
\subsection{Bindings} \subsection{Bindings}
...@@ -295,16 +212,20 @@ bindingParser :: GenParser Char () TypeBinding ...@@ -295,16 +212,20 @@ bindingParser :: GenParser Char () TypeBinding
bindingParser bindingParser
= TypeBinding = TypeBinding
<$> (word <* withWS (try (string "::=") <$> (word <* withWS (try (string "::=")
<|> string ":" <?> "::= expected"))
<?> "Name Type Separator")) <*> lystypeParser
<*> typeParser
\end{code} \end{code}
\begin{verbatim}
$ parseTest bindingParser "a ::= ARRAY INT32"
TypeBinding "a" (ArrayType (ArrayField (OnlyType "INT32")))
\end{verbatim}
\subsection{Extra Helpers} \subsection{Extra Helpers}
\TODO{These are here, they should maybe be moved.} \TODO{These are here, they should maybe be moved.}
\begin{code} \begin{code}
maybeTypeParser = option Nothing $ Just <$> typeParser maybeTypeParser = option Nothing $ Just <$> lystypeParser
maybeBindingParser = option Nothing $ Just <$> bindingParser maybeBindingParser = option Nothing $ Just <$> bindingParser
\end{code} \end{code}
...@@ -51,7 +51,7 @@ The most up to date version of this document, along with its source code, can be ...@@ -51,7 +51,7 @@ The most up to date version of this document, along with its source code, can be
found at \mbox{\url{https://git.lysator.liu.se/hugo/hskom}}. found at \mbox{\url{https://git.lysator.liu.se/hugo/hskom}}.
\chapter {From BNF to AST} \chapter {Parsing protocol-a.txt}
\label{cha:bnfast} \label{cha:bnfast}
What we had from the outset was an info page detailing Protocol A, as well as a What we had from the outset was an info page detailing Protocol A, as well as a
...@@ -69,7 +69,7 @@ chapter. ...@@ -69,7 +69,7 @@ chapter.
\section{Types \& Bindings} \section{Types \& Bindings}
\input{lhs/Notation.lhs} \input{lhs/Notation.lhs}
\section{RPC \& Async} \section{RPC \& Async Declarations}
\label{item:rpc} \label{item:rpc}
\input{lhs/RPC.lhs} \input{lhs/RPC.lhs}
...@@ -81,7 +81,7 @@ chapter. ...@@ -81,7 +81,7 @@ chapter.
\input{lhs/AstHaskell.lhs} \input{lhs/AstHaskell.lhs}
\chapter {From Incomming Message to Haskell} \chapter {Parsing Line-data}
\label{cha:incomming} \label{cha:incomming}
Everything so far has simply been about parsing a BNF file, and generating Everything so far has simply been about parsing a BNF file, and generating
Haskell code from it. Now we start actually looking towards actual data! Haskell code from it. Now we start actually looking towards actual data!