A programming language is a type of project which evaluates a program written in a programming language. Complexity of programming languages vary a lot. Some are integrated into operating system projects.

Often, a user is required to write the program in a text editor like TextEdit or NotePad, then save it as a *.txt file. It is then imported into a list in Scratch. This simplifies editing, as the Scratch List Editor is not very simple to use. Some languages allow programmers to "distribute" code on a forum topic by letting end-users install code by copying and pasting, or in a studio by downloading and importing projects. This allows for virtual devices which users can install multiple applications on.

Implementation

A programming language implementation consists of a lexer, a parser and an evaluator. It often also has a code optimizer.

Lexer

The lexer separates the program into substrings that represent different grammatical components, called "lexemes," which are classified as different "tokens." Tokens may be defined using regular expressions.

Parser

The parser is a finite-state machine that receives the tokens as input and produces a parse tree. Two types of parsing exist: top-down parsing, which starts at the root and ends at the leaves, and bottom-up parsing, which starts at the leaves and ends at the root. Different parsing algorithms exist, such as LR(1), SLR(1), LALR(1), and LL(1). After parsing, the program is represented as a parse tree.

Grammar

The tokens are parsed according to a "context-free grammar." A context-free grammar consists of:

  • a set of tokens
  • a set of nonterminals
  • set of productions
  • a start symbol

Nonterminals are symbols and productions are their definitions. Productions consist of tokens and nonterminals. An example of a production is EXPRESSION;, which may one of many definitions of the nonterminal STATEMENT.

An example of a grammar is:
STATEMENT_LIST : STATEMENT_LIST STATEMENT
| STATEMENT

STATEMENT : EXPRESSION ;
| { STATEMENT_LIST }
| if ( EXPRESSION ) STATEMENT
| if ( EXPRESSION ) STATEMENT else STATEMENT
| while ( EXPRESSION ) STATEMENT

EXPRESSION : ( EXPRESSION )
| TERM

TERM : TERM + FACTOR
| TERM - FACTOR
| FACTOR

FACTOR : FACTOR * number
| FACTOR / number
| number

number : [0-9]*(\.[0-9]*)?

LR parsing

A LR parser is a type of bottom-up parser. The state machine of a LR parser records a stack of symbols. Based on the current input, it can either shift a token onto the stack or reduce the stack by popping off a recognized production and replacing it with the nonterminal it defines. For example, the tokens ["1", "+", "19", ";"] could result in these actions:

Action Stack
shift "1" [1]
reduce FACTOR : number [FACTOR]
reduce TERM : FACTOR [TERM]
shift "+" [TERM, +]
shift "19" [TERM, +, 19]
reduce FACTOR : number [TERM, +, FACTOR]
reduce TERM : TERM + FACTOR [TERM]
reduce EXPRESSION : TERM [EXPRESSION]
shift ";" [EXPRESSION, ;]
reduce STATEMENT : EXPRESSION ; [STATEMENT]
reduce STATEMENT_LIST : STATEMENT [STATEMENT_LIST]

Interpretation

The program may be executed directly from the tree, or it may be translated into an intermediate representation. High-level code is defined by low-level instructions. For example, the code:
1 + 2 * 5

could be represented as these commands:
push 1
push 2
push 5
integerMultiply
integerAdd

Common Features

There are many common features in programming languages as they are easy to set up in Scratch. These include:

Jump/Label for flow control

Jump/Label is common in many low-level languages like BASIC, Batch, etc. It is based on "labeling" lines, then "jumping" the compiler to those lines based on commands. For example:

Label loop
Print "Hi!"
Jump loop

Here, the compiler will first label line 1 as "loop", then print "Hi!", then go back to "loop", where it will continue again (i.e. print "Hi!"). Thus, it is an infinite loop. Complex flows can be achieved with conditional jumping, where a jump happens only if a condition is fulfilled.

Arguments in methods

There is a wide range of techniques to support arguments in methods, but the most efficient is to include each argument on a new line ("line-by-line programming"). This minimizes parsing as it is easy to extract an item of a list rather than a part of a string.

Named Variables

The more advanced languages provide names variables. Names variables operate on two lists. One list contains the name of the variable and the other, the value. The program can retrieve a variable by searching a list until the position of the variable is found in the names list, then getting the value from the values list.

Kinds of Interactions

Interactions are the methods by which a program interacts with the user/programmer. There are usually two types of interactions: Terminal/Console, and IDE/Driver.

Terminal Interaction: CUI environment

Usually, a programming language will be run by a makeshift console consisting of a list display and an ask textbox. Commands like -h and -r are used to get help and run, respectively. Here, the programming language often provides a print function which adds an item to the list, thus "printing" it. An example is QuickSilver.

IDE and Driver Interaction: GUI environment

Another way to run some programs is the IDE and Driver Interaction method. You can create, edit, test, and debug programs in an IDE which is designed to help you write better code. Then you can save your program and others can install it on their own personal VM (driver software) by importing a project which contains the code. Often the VM acts like a simple OS allowing multiple "Apps" to be added and run all from a single database. IDE and Driver programming languages often do not require a console or terminal to be run, they use vector graphics for their output via the Pen. An example of this is Skip.

Examples

See Also