Haskell 2010
Language Report
Simon Marlow
(editor)
Copyright notice.
The authors and publisher intend this Report to belong to the entire Haskell community, and grant permission to copy and distribute it for any purpose, provided that it is reproduced in its entirety, including this Notice. Modified versions of this Report may also be copied and distributed for any purpose, provided that the modified version is clearly presented as such, and that it does not claim to be a definition of the language Haskell 2010.
“Some half dozen persons have written technically on combinatory logic, and most of these, including ourselves, have published something erroneous. Since some of our fellow sinners are among the most careful and competent logicians on the contemporary scene, we regard this as evidence that the subject is refractory. Thus fullness of exposition is necessary for accuracy; and excessive condensation would be false economy here, even more than it is ordinarily.”
Haskell B. Curry and Robert Feys
in the Preface to Combinatory Logic [3], May 31, 1956
In September of 1987 a meeting was held at the conference on Functional Programming Languages and Computer Architecture (FPCA ’87) in Portland, Oregon, to discuss an unfortunate situation in the functional programming community: there had come into being more than a dozen non-strict, purely functional programming languages, all similar in expressive power and semantic underpinnings. There was a strong consensus at this meeting that more widespread use of this class of functional languages was being hampered by the lack of a common language. It was decided that a committee should be formed to design such a language, providing faster communication of new ideas, a stable foundation for real applications development, and a vehicle through which others would be encouraged to use functional languages. This document describes the result of that (and subsequent) committee’s efforts: a purely functional programming language called Haskell, named after the logician Haskell B. Curry whose work provides the logical basis for much of ours.
The committee’s primary goal was to design a language that satisfied these constraints:
The committee intended that Haskell would serve as a basis for future research in language design, and hoped that extensions or variants of the language would appear, incorporating experimental features.
Haskell has indeed evolved continuously since its original publication. By the middle of 1997, there had been five versions of the language design (from Haskell 1.0 – 1.4). At the 1997 Haskell Workshop in Amsterdam, it was decided that a stable variant of Haskell was needed; this became “Haskell 98” and was published in February 1999. The fixing of minor bugs led to the Revised Haskell 98 Report in 2002.
At the 2005 Haskell Workshop, the consensus was that so many extensions to the official language were widely used (and supported by multiple implementations), that it was worthwhile to define another iteration of the language standard, essentially to codify (and legitimise) the status quo.
The Haskell Prime effort was thus conceived as a relatively conservative extension of Haskell 98, taking on board new features only where they were well understood and widely agreed upon. It too was intended to be a “stable” language, yet reflecting the considerable progress in research on language design in recent years.
After several years exploring the design space, it was decided that a single monolithic revision of the language was too large a task, and the best way to make progress was to evolve the language in small incremental steps, each revision integrating only a small number of well-understood extensions and changes. Haskell 2010 is the first revision to be created in this way, and new revisions are expected once per year.
The most significant language changes in Haskell 2010 relative to Haskell 98 are listed here.
New language features:
Removed language features:
The Haskell web site http://haskell.org gives access to many useful resources, including:
You are welcome to comment on, suggest improvements to, and criticise the language or its presentation in the report, via the Haskell mailing list, or the Haskell wiki.
Haskell was created, and continues to be sustained, by an active community of researchers and application programmers. Those who served on the Language and Library committees, in particular, devoted a huge amount of time and energy to the language. Here they are, with their affiliation(s) for the relevant period:
Arvind (MIT)
Lennart Augustsson (Chalmers University)
Dave Barton (Mitre Corp)
Brian Boutel (Victoria University of Wellington)
Warren Burton (Simon Fraser University)
Manuel M T Chakravarty (University of New South Wales)
Duncan Coutts (Well-Typed LLP)
Jon Fairbairn (University of Cambridge)
Joseph Fasel (Los Alamos National Laboratory)
John Goerzen
Andy Gordon (University of Cambridge)
Maria Guzman (Yale University)
Kevin Hammond [editor] (University of Glasgow)
Bastiaan Heeren (Utrecht University)
Ralf Hinze (University of Bonn)
Paul Hudak [editor] (Yale University)
John Hughes [editor] (University of Glasgow; Chalmers University)
Thomas Johnsson (Chalmers University)
Isaac Jones (Galois, inc.)
Mark Jones (Yale University, University of Nottingham, Oregon Graduate Institute)
Dick Kieburtz (Oregon Graduate Institute)
John Launchbury (University of Glasgow; Oregon Graduate Institute; Galois, inc.)
Andres Löh (Utrecht University)
Ian Lynagh (Well-Typed LLP)
Simon Marlow [editor] (Microsoft Research)
John Meacham
Erik Meijer (Utrecht University)
Ravi Nanavati (Bluespec, inc.)
Rishiyur Nikhil (MIT)
Henrik Nilsson (University of Nottingham)
Ross Paterson (City University, London)
John Peterson [editor] (Yale University)
Simon Peyton Jones [editor] (University of Glasgow; Microsoft Research Ltd)
Mike Reeve (Imperial College)
Alastair Reid (University of Glasgow)
Colin Runciman (University of York)
Don Stewart (Galois, inc.)
Martin Sulzmann (Informatik Consulting Systems AG)
Audrey Tang
Simon Thompson (University of Kent)
Philip Wadler [editor] (University of Glasgow)
Malcolm Wallace (University of York)
Stephanie Weirich (University of Pennsylvania)
David Wise (Indiana University)
Jonathan Young (Yale University)
Those marked [editor] served as the co-ordinating editor for one or more revisions of the language.
In addition, dozens of other people made helpful contributions, some small but many substantial. They are as follows: Hans Aberg, Kris Aerts, Sten Anderson, Richard Bird, Tom Blenko, Stephen Blott, Duke Briscoe, Paul Callaghan, Magnus Carlsson, Mark Carroll, Franklin Chen, Olaf Chitil, Chris Clack, Guy Cousineau, Tony Davie, Craig Dickson, Chris Dornan, Laura Dutton, Chris Fasel, Pat Fasel, Sigbjorn Finne, Michael Fryers, Peter Gammie, Andy Gill, Mike Gunter, Cordy Hall, Mark Hall, Thomas Hallgren, Matt Harden, Klemens Hemm, Fergus Henderson, Dean Herington, Bob Hiromoto, Nic Holt, Ian Holyer, Randy Hudson, Alexander Jacobson, Patrik Jansson, Robert Jeschofnik, Orjan Johansen, Simon B. Jones, Stef Joosten, Mike Joy, Wolfram Kahl, Stefan Kahrs, Antti-Juhani Kaijanaho, Jerzy Karczmarczuk, Kent Karlsson, Martin D. Kealey, Richard Kelsey, Siau-Cheng Khoo, Amir Kishon, Feliks Kluzniak, Jan Kort, Marcin Kowalczyk, Jose Labra, Jeff Lewis, Mark Lillibridge, Bjorn Lisper, Sandra Loosemore, Pablo Lopez, Olaf Lubeck, Christian Maeder, Ketil Malde, Michael Marte, Jim Mattson, John Meacham, Sergey Mechveliani, Gary Memovich, Randy Michelsen, Rick Mohr, Andy Moran, Graeme Moss, Arthur Norman, Nick North, Chris Okasaki, Bjarte M. Østvold, Paul Otto, Sven Panne, Dave Parrott, Larne Pekowsky, Rinus Plasmeijer, Ian Poole, Stephen Price, John Robson, Andreas Rossberg, George Russell, Patrick Sansom, Michael Schneider, Felix Schroeter, Julian Seward, Nimish Shah, Christian Sievers, Libor Skarvada, Jan Skibinski, Lauren Smith, Raman Sundaresh, Josef Svenningsson, Ken Takusagawa, Wolfgang Thaller, Satish Thatte, Tom Thomson, Tommy Thorn, Dylan Thurston, Mike Thyer, Mark Tullsen, David Tweed, Pradeep Varma, Keith Wansbrough, Tony Warnock, Michael Webber, Carl Witty, Stuart Wray, and Bonnie Yantis.
Finally, aside from the important foundational work laid by Church, Rosser, Curry, and others on the lambda calculus, it is right to acknowledge the influence of many noteworthy programming languages developed over the years. Although it is difficult to pinpoint the origin of many ideas, the following languages were particularly influential: Lisp (and its modern-day incarnations Common Lisp and Scheme); Landin’s ISWIM; APL; Backus’s FP [1]; ML and Standard ML; Hope and Hope+; Clean; Id; Gofer; Sisal; and Turner’s series of languages culminating in Miranda1 . Without these forerunners Haskell would not have been possible.
Simon Marlow
Cambridge, April 2010
Haskell is a general purpose, purely functional programming language incorporating many recent innovations in programming language design. Haskell provides higher-order functions, non-strict semantics, static polymorphic typing, user-defined algebraic datatypes, pattern-matching, list comprehensions, a module system, a monadic I/O system, and a rich set of primitive datatypes, including lists, arrays, arbitrary and fixed precision integers, and floating-point numbers. Haskell is both the culmination and solidification of many years of research on non-strict functional languages.
This report defines the syntax for Haskell programs and an informal abstract semantics for the meaning of such programs. We leave as implementation dependent the ways in which Haskell programs are to be manipulated, interpreted, compiled, etc. This includes such issues as the nature of programming environments and the error messages returned for undefined programs (i.e. programs that formally evaluate to ⊥).
In this section, we describe the abstract syntactic and semantic structure of Haskell, as well as how it relates to the organization of the rest of the report.
This report proceeds bottom-up with respect to Haskell’s syntactic structure.
The chapters not mentioned above are Chapter 6, which describes the standard built-in datatypes and classes in Haskell, and Chapter 7, which discusses the I/O facility in Haskell (i.e. how Haskell programs communicate with the outside world). Also, there are several chapters describing the Prelude, the concrete syntax, literate programming, the specification of derived instances, and pragmas supported by most Haskell compilers.
Examples of Haskell program fragments in running text are given in typewriter font:
“Holes” in program fragments representing arbitrary pieces of Haskell code are written in italics, as in if e1 then e2 else e3. Generally the italicized names are mnemonic, such as e for expressions, d for declarations, t for types, etc.
Haskell has adopted many of the convenient syntactic structures that have become popular in functional programming. In this Report, the meaning of such syntactic sugar is given by translation into simpler constructs. If these translations are applied exhaustively, the result is a program written in a small subset of Haskell that we call the Haskell kernel.
Although the kernel is not formally specified, it is essentially a slightly sugared variant of the lambda calculus with a straightforward denotational semantics. The translation of each syntactic structure into the kernel is given as the syntax is introduced. This modular design facilitates reasoning about Haskell programs and provides useful guidelines for implementors of the language.
An expression evaluates to a value and has a static type. Values and types are not mixed in Haskell. However, the type system allows user-defined datatypes of various sorts, and permits not only parametric polymorphism (using a traditional Hindley-Milner type structure) but also ad hoc polymorphism, or overloading (using type classes).
Errors in Haskell are semantically equivalent to ⊥ (“bottom”). Technically, they are indistinguishable from nontermination, so the language includes no mechanism for detecting or acting upon errors. However, implementations will probably try to provide useful information about errors. See Section 3.1.
There are six kinds of names in Haskell: those for variables and constructors denote values; those for type variables, type constructors, and type classes refer to entities related to the type system; and module names refer to modules. There are two constraints on naming:
These are the only constraints; for example, Int may simultaneously be the name of a module, class, and constructor within a single scope.
In this chapter, we describe the low-level lexical structure of Haskell. Most of the details may be skipped in a first reading of the report.
These notational conventions are used for presenting syntax:
[pattern] | optional |
{pattern} | zero or more repetitions |
(pattern) | grouping |
pat1 | pat2 | choice |
pat⟨pat′⟩ | difference—elements generated by pat |
except those generated by pat′ |
|
fibonacci | terminal syntax in typewriter font |
Because the syntax in this section describes lexical syntax, all whitespace is expressed explicitly; there is no implicit space between juxtaposed symbols. BNF-like syntax is used throughout, with productions having the form:
nonterm | → | alt1 | alt2 | … | altn |
Care must be taken in distinguishing metalogical syntax such as | and […] from concrete terminal syntax (given in typewriter font) such as | and [...], although usually the context makes the distinction clear.
Haskell uses the Unicode [2] character set. However, source programs are currently biased toward the ASCII character set used in earlier versions of Haskell.
This syntax depends on properties of the Unicode characters as defined by the Unicode consortium. Haskell compilers are expected to make use of new versions of Unicode as they are made available.
program | → | { lexeme | whitespace } |
lexeme | → | qvarid | qconid | qvarsym | qconsym |
| | literal | special | reservedop | reservedid | |
literal | → | integer | float | char | string |
special | → | ( | ) | , | ; | [ | ] | ` | { | } |
whitespace | → | whitestuff {whitestuff} |
whitestuff | → | whitechar | comment | ncomment |
whitechar | → | newline | vertab | space | tab | uniWhite |
newline | → | return linefeed | return | linefeed | formfeed |
return | → | a carriage return |
linefeed | → | a line feed |
vertab | → | a vertical tab |
formfeed | → | a form feed |
space | → | a space |
tab | → | a horizontal tab |
uniWhite | → | any Unicode character defined as whitespace |
comment | → | dashes [ any⟨symbol⟩ {any} ] newline |
dashes | → | -- {-} |
opencom | → | {- |
closecom | → | -} |
ncomment | → | opencom ANY seq {ncomment ANY seq} closecom |
ANY seq | → | {ANY }⟨{ANY } ( opencom | closecom ) {ANY }⟩ |
ANY | → | graphic | whitechar |
any | → | graphic | space | tab |
graphic | → | small | large | symbol | digit | special | " | ' |
small | → | ascSmall | uniSmall | _ |
ascSmall | → | a | b | … | z |
uniSmall | → | any Unicode lowercase letter |
large | → | ascLarge | uniLarge |
ascLarge | → | A | B | … | Z |
uniLarge | → | any uppercase or titlecase Unicode letter |
symbol | → | ascSymbol | uniSymbol⟨special | _ | " | '⟩ |
ascSymbol | → | ! | # | $ | % | & | ⋆ | + | . | / | < | = | > | ? | @ |
| | \ | ^ | | | - | ~ | : | |
uniSymbol | → | any Unicode symbol or punctuation |
digit | → | ascDigit | uniDigit |
ascDigit | → | 0 | 1 | … | 9 |
uniDigit | → | any Unicode decimal digit |
octit | → | 0 | 1 | … | 7 |
hexit | → | digit | A | … | F | a | … | f |
Lexical analysis should use the “maximal munch” rule: at each point, the longest possible lexeme satisfying the lexeme production is read. So, although case is a reserved word, cases is not. Similarly, although = is reserved, == and ~= are not.
Any kind of whitespace is also a proper delimiter for lexemes.
Characters not in the category ANY are not valid in Haskell programs and should result in a lexing error.
Comments are valid whitespace.
An ordinary comment begins with a sequence of two or more consecutive dashes (e.g. --) and extends to the following newline. The sequence of dashes must not form part of a legal lexeme. For example, “-->” or “|--” do not begin a comment, because both of these are legal lexemes; however “--foo” does start a comment.
A nested comment begins with “{-” and ends with “-}”. No legal lexeme starts with “{-”; hence, for example, “{---” starts a nested comment despite the trailing dashes.
The comment itself is not lexically analysed. Instead, the first unmatched occurrence of the string “-}” terminates the nested comment. Nested comments may be nested to any depth: any occurrence of the string “{-” within the nested comment starts a new nested comment, terminated by “-}”. Within a nested comment, each “{-” is matched by a corresponding occurrence of “-}”.
In an ordinary comment, the character sequences “{-” and “-}” have no special significance, and, in a nested comment, a sequence of dashes has no special significance.
Nested comments are also used for compiler pragmas, as explained in Chapter 12.
If some code is commented out using a nested comment, then any occurrence of {- or -} within a string or within an end-of-line comment in that code will interfere with the nested comments.
varid | → | (small {small | large | digit | ' })⟨reservedid⟩ |
conid | → | large {small | large | digit | ' } |
reservedid | → | case | class | data | default | deriving | do | else |
| | foreign | if | import | in | infix | infixl | |
| | infixr | instance | let | module | newtype | of | |
| | then | type | where | _ |
An identifier consists of a letter followed by zero or more letters, digits, underscores, and single quotes. Identifiers are lexically distinguished into two namespaces (Section 1.4): those that begin with a lowercase letter (variable identifiers) and those that begin with an upper-case letter (constructor identifiers). Identifiers are case sensitive: name, naMe, and Name are three distinct identifiers (the first two are variable identifiers, the last is a constructor identifier).
Underscore, “_”, is treated as a lowercase letter, and can occur wherever a lowercase letter can. However, “_” all by itself is a reserved identifier, used as wild card in patterns. Compilers that offer warnings for unused identifiers are encouraged to suppress such warnings for identifiers beginning with underscore. This allows programmers to use “_foo” for a parameter that they expect to be unused.
varsym | → | ( symbol⟨:⟩ {symbol} )⟨reservedop | dashes⟩ |
consym | → | ( : {symbol})⟨reservedop⟩ |
reservedop | → | .. | : | :: | = | \ | | | <- | -> | @ | ~ | => |
Operator symbols are formed from one or more symbol characters, as defined above, and are lexically distinguished into two namespaces (Section 1.4):
Notice that a colon by itself, “:”, is reserved solely for use as the Haskell list constructor; this makes its treatment uniform with other parts of list syntax, such as “[]” and “[a,b]”.
Other than the special syntax for prefix negation, all operators are infix, although each infix operator can be used in a section to yield partially applied operators (see Section 3.5). All of the standard infix operators are just predefined symbols and may be rebound.
In the remainder of the report six different kinds of names will be used:
varid | (variables) | ||
conid | (constructors) | ||
tyvar | → | varid | (type variables) |
tycon | → | conid | (type constructors) |
tycls | → | conid | (type classes) |
modid | → | {conid .} conid | (modules) |
Variables and type variables are represented by identifiers beginning with small letters, and the others by identifiers beginning with capitals; also, variables and constructors have infix forms, the other four do not. Module names are a dot-separated sequence of conids. Namespaces are also discussed in Section 1.4.
A name may optionally be qualified in certain circumstances by prepending them with a module identifier. This applies to variable, constructor, type constructor and type class names, but not type variables or module names. Qualified names are discussed in detail in Chapter 5.
qvarid | → | [modid .] varid |
qconid | → | [modid .] conid |
qtycon | → | [modid .] tycon |
qtycls | → | [modid .] tycls |
qvarsym | → | [modid .] varsym |
qconsym | → | [modid .] consym |
Since a qualified name is a lexeme, no spaces are allowed between the qualifier and the name. Sample lexical analyses are shown below.
This | Lexes as this |
f.g | f . g (three tokens) |
F.g | F.g (qualified ‘g’) |
f.. | f .. (two tokens) |
F.. | F.. (qualified ‘.’) |
F. | F . (two tokens) |
decimal | → | digit{digit} |
octal | → | octit{octit} |
hexadecimal | → | hexit{hexit} |
integer | → | decimal |
| | 0o octal | 0O octal | |
| | 0x hexadecimal | 0X hexadecimal | |
float | → | decimal . decimal [exponent] |
| | decimal exponent | |
exponent | → | (e | E) [+ | -] decimal |
There are two distinct kinds of numeric literals: integer and floating. Integer literals may be given in decimal (the default), octal (prefixed by 0o or 0O) or hexadecimal notation (prefixed by 0x or 0X). Floating literals are always decimal. A floating literal must contain digits both before and after the decimal point; this ensures that a decimal point cannot be mistaken for another use of the dot character. Negative numeric literals are discussed in Section 3.4. The typing of numeric literals is discussed in Section 6.4.1.
char | → | ' (graphic⟨' | \⟩ | space | escape⟨\&⟩) ' |
string | → | " {graphic⟨" | \⟩ | space | escape | gap} " |
escape | → | \ ( charesc | ascii | decimal | o octal | x hexadecimal ) |
charesc | → | a | b | f | n | r | t | v | \ | " | ' | & |
ascii | → | ^cntrl | NUL | SOH | STX | ETX | EOT | ENQ | ACK |
| | BEL | BS | HT | LF | VT | FF | CR | SO | SI | DLE | |
| | DC1 | DC2 | DC3 | DC4 | NAK | SYN | ETB | CAN | |
| | EM | SUB | ESC | FS | GS | RS | US | SP | DEL | |
cntrl | → | ascLarge | @ | [ | \ | ] | ^ | _ |
gap | → | \ whitechar {whitechar} \ |
Character literals are written between single quotes, as in 'a', and strings between double quotes, as in "Hello".
Escape codes may be used in characters and strings to represent special characters. Note that a single quote ' may be used in a string, but must be escaped in a character; similarly, a double quote " may be used in a character, but must be escaped in a string. \ must always be escaped. The category charesc also includes portable representations for the characters “alert” (\a), “backspace” (\b), “form feed” (\f), “new line” (\n), “carriage return” (\r), “horizontal tab” (\t), and “vertical tab” (\v).
Escape characters for the Unicode character set, including control characters such as \^X, are also provided. Numeric escapes such as \137 are used to designate the character with decimal representation 137; octal (e.g. \o137) and hexadecimal (e.g. \x37) representations are also allowed.
Consistent with the “maximal munch” rule, numeric escape characters in strings consist of all consecutive digits and may be of arbitrary length. Similarly, the one ambiguous ASCII escape code, "\SOH", is parsed as a string of length 1. The escape character \& is provided as a “null character” to allow strings such as "\137\&9" and "\SO\&H" to be constructed (both of length two). Thus "\&" is equivalent to "" and the character '\&' is disallowed. Further equivalences of characters are defined in Section 6.1.2.
A string may include a “gap”—two backslants enclosing white characters—which is ignored. This allows one to write long strings on more than one line by writing a backslant at the end of one line and at the start of the next. For example,
String literals are actually abbreviations for lists of characters (see Section 3.7).
Haskell permits the omission of the braces and semicolons used in several grammar productions, by using layout to convey the same information. This allows both layout-sensitive and layout-insensitive styles of coding, which can be freely mixed within one program. Because layout is not required, Haskell programs can be straightforwardly produced by other programs.
The effect of layout on the meaning of a Haskell program can be completely specified by adding braces and semicolons in places determined by the layout. The meaning of this augmented program is now layout insensitive.
Informally stated, the braces and semicolons are inserted as follows. The layout (or “off-side”) rule takes effect whenever the open brace is omitted after the keyword where, let, do, or of. When this happens, the indentation of the next lexeme (whether or not on a new line) is remembered and the omitted open brace is inserted (the whitespace preceding the lexeme may include comments). For each subsequent line, if it contains only whitespace or is indented more, then the previous item is continued (nothing is inserted); if it is indented the same amount, then a new item begins (a semicolon is inserted); and if it is indented less, then the layout list ends (a close brace is inserted). If the indentation of the non-brace lexeme immediately following a where, let, do or of is less than or equal to the current indentation level, then instead of starting a layout, an empty list “{}” is inserted, and layout processing occurs for the current level (i.e. insert a semicolon or close brace). A close brace is also inserted whenever the syntactic category containing the layout list ends; that is, if an illegal lexeme is encountered at a point where a close brace would be legal, a close brace is inserted. The layout rule matches only those open braces that it has inserted; an explicit open brace must be matched by an explicit close brace. Within these explicit open braces, no layout processing is performed for constructs outside the braces, even if a line is indented to the left of an earlier implicit open brace.
Section 10.3 gives a more precise definition of the layout rules.
Given these rules, a single newline may actually terminate several layout lists. Also, these rules permit:
making a, b and g all part of the same layout list.
As an example, Figure 2.1 shows a (somewhat contrived) module and Figure 2.2 shows the result of applying the layout rule to it. Note in particular: (a) the line beginning }};pop, where the termination of the previous line invokes three applications of the layout rule, corresponding to the depth (3) of the nested where clauses, (b) the close braces in the where clause nested within the tuple and case expression, inserted because the end of the tuple was detected, and (c) the close brace at the very end, inserted because of the column 0 indentation of the end-of-file token.
In this chapter, we describe the syntax and informal semantics of Haskell expressions, including their translations into the Haskell kernel, where appropriate. Except in the case of let expressions, these translations preserve both the static and dynamic semantics. Free variables and constructors used in these translations always refer to entities defined by the Prelude. For example, “concatMap” used in the translation of list comprehensions (Section 3.11) means the concatMap defined by the Prelude, regardless of whether or not the identifier “concatMap” is in scope where the list comprehension is used, and (if it is in scope) what it is bound to.
exp | → | infixexp :: [context =>] type | (expression type signature) |
| | infixexp | ||
infixexp | → | lexp qop infixexp | (infix operator application) |
| | - infixexp | (prefix negation) | |
| | lexp | ||
lexp | → | \ apat1 … apatn -> exp | (lambda abstraction, n ≥ 1) |
| | let decls in exp | (let expression) | |
| | if exp [;] then exp [;] else exp | (conditional) | |
| | case exp of { alts } | (case expression) | |
| | do { stmts } | (do expression) | |
| | fexp | ||
fexp | → | [fexp] aexp | (function application) |
aexp | → | qvar | (variable) |
| | gcon | (general constructor) | |
| | literal | ||
| | ( exp ) | (parenthesized expression) | |
| | ( exp1 , … , expk ) | (tuple, k ≥ 2) | |
| | [ exp1 , … , expk ] | (list, k ≥ 1) | |
| | [ exp1 [, exp2] .. [exp3] ] | (arithmetic sequence) | |
| | [ exp | qual1 , … , qualn ] | (list comprehension, n ≥ 1) | |
| | ( infixexp qop ) | (left section) | |
| | ( qop⟨-⟩ infixexp ) | (right section) | |
| | qcon { fbind1 , … , fbindn } | (labeled construction, n ≥ 0) | |
| | aexp⟨qcon⟩ { fbind1 , … , fbindn } | (labeled update, n ≥ 1) | |
Expressions involving infix operators are disambiguated by the operator’s fixity (see Section 4.4.2). Consecutive unparenthesized operators with the same precedence must both be either left or right associative to avoid a syntax error. Given an unparenthesized expression “x qop(a,i) y qop(b,j) z” (where qop(a,i) means an operator with associativity a and precedence i), parentheses must be added around either “xqop(a,i) y” or “y qop(b,j) z” when i = j unless a = b = l or a = b = r.
An example algorithm for resolving expressions involving infix operators is given in Section 10.6.
Negation is the only prefix operator in Haskell; it has the same precedence as the infix - operator defined in the Prelude (see Section 4.4.2, Figure 4.1).
The grammar is ambiguous regarding the extent of lambda abstractions, let expressions, and conditionals. The ambiguity is resolved by the meta-rule that each of these constructs extends as far to the right as possible.
Sample parses are shown below.
This | Parses as |
f x + g y | (f x) + (g y) |
- f x + y | (- (f x)) + y |
let { ... } in x + y | let { ... } in (x + y) |
z + let { ... } in x + y | z + (let { ... } in (x + y)) |
f x y :: Int | (f x y) :: Int |
\ x -> a+b :: Int | \ x -> ((a+b) :: Int) |
For the sake of clarity, the rest of this section will assume that expressions involving infix operators have been resolved according to the fixities of the operators.
Errors during expression evaluation, denoted by ⊥ (“bottom”), are indistinguishable by a Haskell program from non-termination. Since Haskell is a non-strict language, all Haskell types include ⊥. That is, a value of any type may be bound to a computation that, when demanded, results in an error. When evaluated, errors cause immediate program termination and cannot be caught by the user. The Prelude provides two functions to directly cause such errors:
A call to error terminates execution of the program and returns an appropriate error indication to the operating system. It should also display the string in some system-dependent manner. When undefined is used, the error message is created by the compiler.
Translations of Haskell expressions use error and undefined to explicitly indicate where execution time errors may occur. The actual program behavior when an error occurs is up to the implementation. The messages passed to the error function in these translations are only suggestions; implementations may choose to display more or less information when an error occurs.
aexp | → | qvar | (variable) |
| | gcon | (general constructor) | |
| | literal |
gcon | → | () | |
| | [] | ||
| | (,{,}) | ||
| | qcon | ||
var | → | varid | ( varsym ) | (variable) |
qvar | → | qvarid | ( qvarsym ) | (qualified variable) |
con | → | conid | ( consym ) | (constructor) |
qcon | → | qconid | ( gconsym ) | (qualified constructor) |
varop | → | varsym | ` varid ` | (variable operator) |
qvarop | → | qvarsym | ` qvarid ` | (qualified variable operator) |
conop | → | consym | ` conid ` | (constructor operator) |
qconop | → | gconsym | ` qconid ` | (qualified constructor operator) |
op | → | varop | conop | (operator) |
qop | → | qvarop | qconop | (qualified operator) |
gconsym | → | : | qconsym |
Haskell provides special syntax to support infix notation. An operator is a function that can be applied using infix syntax (Section 3.4), or partially applied using a section (Section 3.5).
An operator is either an operator symbol, such as + or $$, or is an ordinary identifier enclosed in grave accents (backquotes), such as ` op `. For example, instead of writing the prefix application op x y, one can write the infix application x` op ` y. If no fixity declaration is given for ` op ` then it defaults to highest precedence and left associativity (see Section 4.4.2).
Dually, an operator symbol can be converted to an ordinary identifier by enclosing it in parentheses. For example, (+) x y is equivalent to x + y, and foldr (⋆) 1 xs is equivalent to foldr (\x y -> x⋆y) 1 xs.
Special syntax is used to name some constructors for some of the built-in types, as found in the production for gcon and literal. These are described in Section 6.1.
An integer literal represents the application of the function fromInteger to the appropriate value of type Integer. Similarly, a floating point literal stands for an application of fromRational to a value of type Rational (that is, Ratio Integer).
Translation: The integer literal i is equivalent to fromInteger i, where fromInteger is a method in class Num (see Section 6.4.1).
The floating point literal f is equivalent to fromRational (n Ratio.% d), where fromRational is a method in class Fractional and Ratio.% constructs a rational from two integers, as defined in the Ratio library. The integers n and d are chosen so that n∕d = f.
fexp | → | [fexp] aexp | (function application) |
lexp | → | \ apat1 … apatn -> exp | (lambda abstraction, n ≥ 1) |
Function application is written e1 e2. Application associates to the left, so the parentheses may be omitted in (f x) y. Because e1 could be a data constructor, partial applications of data constructors are allowed.
Lambda abstractions are written \ p1 … pn -> e, where the pi are patterns. An expression such as \x:xs->x is syntactically incorrect; it may legally be written as \(x:xs)->x.
The set of patterns must be linear—no variable may appear more than once in the set.
Given this translation combined with the semantics of case expressions and pattern matching described in Section 3.17.3, if the pattern fails to match, then the result is ⊥.
infixexp | → | lexp qop infixexp | |
| | - infixexp | (prefix negation) | |
| | lexp | ||
qop | → | qvarop | qconop | (qualified operator) |
The form e1 qop e2 is the infix application of binary operator qop to expressions e1 and e2.
The special form -e denotes prefix negation, the only prefix operator in Haskell, and is syntax for negate (e). The binary - operator does not necessarily refer to the definition of - in the Prelude; it may be rebound by the module system. However, unary - will always refer to the negate function defined in the Prelude. There is no link between the local meaning of the - operator and unary negation.
Prefix negation has the same precedence as the infix operator - defined in the Prelude (see Table 4.1). Because e1-e2 parses as an infix application of the binary operator -, one must write e1(-e2) for the alternative parsing. Similarly, (-) is syntax for (\ x y -> x-y), as with any infix operator, and does not denote (\ x -> -x)—one must use negate for that.
aexp | → | ( infixexp qop ) | (left section) |
| | ( qop⟨-⟩ infixexp ) | (right section) |
Sections are written as ( op e ) or ( e op ), where op is a binary operator and e is an expression. Sections are a convenient syntax for partial application of binary operators.
Syntactic precedence rules apply to sections as follows. (op e) is legal if and only if (x op e) parses in the same way as (x op (e)); and similarly for (e op). For example, (⋆a+b) is syntactically invalid, but (+a⋆b) and (⋆(a+b)) are valid. Because (+) is left associative, (a+b+) is syntactically correct, but (+a+b) is not; the latter may legally be written as (+(a+b)). As another example, the expression
is invalid because, by the let/lambda meta-rule (Section 3), the expression
parses as
rather than
Because - is treated specially in the grammar, (- exp) is not a section, but an application of prefix negation, as described in the preceding section. However, there is a subtract function defined in the Prelude such that (subtract exp) is equivalent to the disallowed section. The expression (+ (- exp)) can serve the same purpose.
lexp | → | if exp [;] then exp [;] else exp |
A conditional expression has the form if e1 then e2 else e3 and returns the value of e2 if the value of e1 is True, e3 if e1 is False, and ⊥ otherwise.
Translation: The following identity holds:
if e1 then e2 else e3 | = | case e1 of { True -> e2 ; False -> e3 } |
where True and False are the two nullary constructors from the type Bool, as defined in the Prelude. The type of e1 must be Bool; e2 and e3 must have the same type, which is also the type of the entire conditional expression.
infixexp | → | exp1 qop exp2 | |
aexp | → | [ exp1 , … , expk ] | (k ≥ 1) |
| | gcon | ||
gcon | → | [] | |
| | qcon | ||
qcon | → | ( gconsym ) | |
qop | → | qconop | |
qconop | → | gconsym | |
gconsym | → | : |
Lists are written [e1, …, ek], where k ≥ 1. The list constructor is :, and the empty list is denoted []. Standard operations on lists are given in the Prelude (see Section 6.1.3, and Chapter 9 notably Section 9.1).
Translation: The following identity holds:
[e1, …, ek] | = | e1 : (e2 : ( … (ek : []))) |
where : and [] are constructors for lists, as defined in the Prelude (see Section 6.1.3). The types of e1 through ek must all be the same (call it t), and the type of the overall expression is [t] (see Section 4.1.2).
The constructor “:” is reserved solely for list construction; like [], it is considered part of the language syntax, and cannot be hidden or redefined. It is a right-associative operator, with precedence level 5 (Section 4.4.2).
aexp | → | ( exp1 , … , expk ) | (k ≥ 2) |
| | qcon | ||
qcon | → | (,{,}) | |
Tuples are written (e1, …, ek), and may be of arbitrary length k ≥ 2. The constructor for an n-tuple is denoted by (,…,), where there are n − 1 commas. Thus (a,b,c) and (,,) a b c denote the same value. Standard operations on tuples are given in the Prelude (see Section 6.1.4 and Chapter 9).
Translation: (e1, …, ek) for k ≥ 2 is an instance of a k-tuple as defined in the Prelude, and requires no translation. If t1 through tk are the types of e1 through ek, respectively, then the type of the resulting tuple is (t1, …, tk) (see Section 4.1.2).
aexp | → | gcon |
| | ( exp ) | |
gcon | → | () |
The form (e) is simply a parenthesized expression, and is equivalent to e. The unit expression () has type () (see Section 4.1.2). It is the only member of that type apart from ⊥, and can be thought of as the “nullary tuple” (see Section 6.1.5).
aexp | → | [ exp1 [, exp2] .. [exp3] ] |
The arithmetic sequence [e1, e2 .. e3] denotes a list of values of type t, where each of the ei has type t, and t is an instance of class Enum.
Translation: Arithmetic sequences satisfy these identities:
[ e1.. ] | = | enumFrom e1 |
[ e1,e2.. ] | = | enumFromThen e1 e2 |
[ e1..e3 ] | = | enumFromTo e1 e3 |
[ e1,e2..e3 ] | = | enumFromThenTo e1 e2 e3 |
where enumFrom, enumFromThen, enumFromTo, and enumFromThenTo are class methods in the class Enum as defined in the Prelude (see Figure 6.1).
The semantics of arithmetic sequences therefore depends entirely on the instance declaration for the type t. See Section 6.3.4 for more details of which Prelude types are in Enum and their semantics.
aexp | → | [ exp | qual1 , … , qualn ] | (list comprehension, n ≥ 1) |
qual | → | pat <- exp | (generator) |
| | let decls | (local declaration) | |
| | exp | (boolean guard) |
A list comprehension has the form [ e | q1, …, qn ],n ≥ 1, where the qi qualifiers are either
Such a list comprehension returns the list of elements produced by evaluating e in the successive environments created by the nested, depth-first evaluation of the generators in the qualifier list. Binding of variables occurs according to the normal pattern matching rules (see Section 3.17), and if a match fails then that element of the list is simply skipped over. Thus:
yields the list [4,2]. If a qualifier is a boolean guard, it must evaluate to True for the previous pattern match to succeed. As usual, bindings in list comprehensions can shadow those in outer scopes; for example:
[ x | x <- x, x <- x ] | = | [ z | y <- x, z <- y] |
Translation: List comprehensions satisfy these identities, which may be used as a translation into the kernel:
[ e | True ] | = | [e] |
[ e | q ] | = | [ e | q, True ] |
[ e | b, Q ] | = | if b then [ e | Q ] else [] |
[ e | p <- l, Q ] | = | let ok p = [ e | Q ] |
ok _ = [] | ||
in concatMap ok l | ||
[ e | let decls, Q ] | = | let decls in [ e | Q ] |
where e ranges over expressions, p over patterns, l over list-valued expressions, b over boolean expressions, decls over declaration lists, q over qualifiers, and Q over sequences of qualifiers. ok is a fresh variable. The function concatMap, and boolean value True, are defined in the Prelude.
As indicated by the translation of list comprehensions, variables bound by let have fully polymorphic types while those defined by <- are lambda bound and are thus monomorphic (see Section 4.5.4).
lexp | → | let decls in exp |
Let expressions have the general form let { d1 ; … ; dn } in e, and introduce a nested, lexically-scoped, mutually-recursive list of declarations (let is often called letrec in other languages). The scope of the declarations is the expression e and the right hand side of the declarations. Declarations are described in Chapter 4. Pattern bindings are matched lazily; an implicit ~ makes these patterns irrefutable. For example,
let (x,y) = undefined in e |
does not cause an execution-time error until x or y is evaluated.
Translation: The dynamic semantics of the expression let { d1 ; … ; dn } in e0 are captured by this translation: After removing all type signatures, each declaration di is translated into an equation of the form pi = ei, where pi and ei are patterns and expressions respectively, using the translation in Section 4.4.3. Once done, these identities hold, which may be used as a translation into the kernel:
let {p1=e1; ... ; pn=en} in e0 | = | let (~p1, ... ,~pn) = (e1, ... ,en) in e0 |
let p = e1 in e0 | = | case e1 of ~p -> e0 |
where no variable in p appears free in e1 | ||
let p = e1 in e0 | = | let p = fix ( \ ~p -> e1) in e0 |
where fix is the least fixpoint operator. Note the use of the irrefutable patterns ~p. This translation does not preserve the static semantics because the use of case precludes a fully polymorphic typing of the bound variables. The static semantics of the bindings in a let expression are described in Section 4.4.3.
lexp | → | case exp of { alts } | |
alts | → | alt1 ; … ; altn | (n ≥ 1) |
alt | → | pat -> exp [where decls] | |
| | pat gdpat [where decls] | ||
| | (empty alternative) | ||
gdpat | → | guards -> exp [ gdpat ] | |
guards | → | | guard1, …, guardn | (n ≥ 1) |
guard | → | pat <- infixexp | (pattern guard) |
| | let decls | (local declaration) | |
| | infixexp | (boolean guard) |
A case expression has the general form case e of { p1 match1 ; … ; pn matchn } where each matchi is of the general form
| gsi1 | -> ei1 |
|
… |
||
| gsimi | -> eimi |
|
where declsi
|
A guard has one of the following forms:
An alternative of the form pat -> exp where decls is treated as shorthand for:
pat | True | -> exp |
|
where decls
|
A case expression must have at least one alternative and each alternative must have at least one body. Each body must have the same type, and the type of the whole expression is that type.
A case expression is evaluated by pattern matching the expression e against the individual alternatives. The alternatives are tried sequentially, from top to bottom. If e matches the pattern of an alternative, then the guarded expressions for that alternative are tried sequentially from top to bottom in the environment of the case expression extended first by the bindings created during the matching of the pattern, and then by the declsi in the where clause associated with that alternative.
For each guarded expression, the comma-separated guards are tried sequentially from left to right. If all of them succeed, then the corresponding expression is evaluated in the environment extended with the bindings introduced by the guards. That is, the bindings that are introduced by a guard (either by using a let clause or a pattern guard) are in scope in the following guards and the corresponding expression. If any of the guards fail, then this guarded expression fails and the next guarded expression is tried.
If none of the guarded expressions for a given alternative succeed, then matching continues with the next alternative. If no alternative succeeds, then the result is ⊥. Pattern matching is described in Section 3.17, with the formal semantics of case expressions in Section 3.17.3.
A note about parsing. The expression
is tricky to parse correctly. It has a single unambiguous parse, namely
However, the phrase Bool -> a is syntactically valid as a type, and parsers with limited lookahead may incorrectly commit to this choice, and hence reject the program. Programmers are advised, therefore, to avoid guards that end with a type signature — indeed that is why a guard contains an infixexp not an exp.
lexp | → | do { stmts } | (do expression) |
stmts | → | stmt1 … stmtn exp [;] | (n ≥ 0) |
stmt | → | exp ; | |
| | pat <- exp ; | ||
| | let decls ; | ||
| | ; | (empty statement) |
A do expression provides a more conventional syntax for monadic programming. It allows an expression such as
to be written in a more traditional way as:
Translation: Do expressions satisfy these identities, which may be used as a translation into the kernel, after eliminating empty stmts:
do {e} | = | e |
do {e;stmts} | = | e >> do {stmts} |
do {p <- e; stmts} | = | let ok p = do {stmts} |
ok _ = fail "..." | ||
in e >>= ok | ||
do {let decls; stmts} | = | let decls in do {stmts} |
The ellipsis "..." stands for a compiler-generated error message, passed to fail, preferably giving some indication of the location of the pattern-match failure; the functions >>, >>=, and fail are operations in the class Monad, as defined in the Prelude; and ok is a fresh identifier.
As indicated by the translation of do, variables bound by let have fully polymorphic types while those defined by <- are lambda bound and are thus monomorphic.
A datatype declaration may optionally define field labels (see Section 4.2.1). These field labels can be used to construct, select from, and update fields in a manner that is independent of the overall structure of the datatype.
Different datatypes cannot share common field labels in the same scope. A field label can be used at most once in a constructor. Within a datatype, however, a field label can be used in more than one constructor provided the field has the same typing in all constructors. To illustrate the last point, consider:
Here S is legal but T is not, because y is given inconsistent typings in the latter.
aexp | → | qvar |
Field labels are used as selector functions. When used as a variable, a field label serves as a function that extracts the field from an object. Selectors are top level bindings and so they may be shadowed by local variables but cannot conflict with other top level bindings of the same name. This shadowing only affects selector functions; in record construction (Section 3.15.2) and update (Section 3.15.3), field labels cannot be confused with ordinary variables.
Translation: A field label f introduces a selector function defined as:
f x | = | case x of { C1 p11 … p1k -> e1 ;… ; Cn pn1 … pnk -> en } |
where C1 … Cn are all the constructors of the datatype containing a field labeled with f, pij is y when f labels the jth component of Ci or _ otherwise, and ei is y when some field in Ci has a label of f or undefined otherwise.
aexp | → | qcon { fbind1 , … , fbindn } | (labeled construction, n ≥ 0) |
fbind | → | qvar = exp |
A constructor with labeled fields may be used to construct a value in which the components are specified by name rather than by position. Unlike the braces used in declaration lists, these are not subject to layout; the { and } characters must be explicit. (This is also true of field updates and field patterns.) Construction using field labels is subject to the following constraints:
The expression F {}, where F is a data constructor, is legal whether or not F was declared with record syntax (provided F has no strict fields — see the fourth bullet above); it denotes F ⊥1 … ⊥n, where n is the arity of F.
Translation: In the binding f = v, the field f labels v.
C { bs } | = | C (pick1C bs undefined) … (pickkC bs undefined) |
where k is the arity of C.
The auxiliary function pickiC bs d is defined as follows:
If the ith component of a constructor C has the field label f, and if f = v appears in the binding list bs, then pickiC bs d is v. Otherwise, pickiC bs d is the default value d.
aexp | → | aexp⟨qcon⟩ { fbind1 , … , fbindn } | (labeled update, n ≥ 1) |
Values belonging to a datatype with field labels may be non-destructively updated. This creates a new value in which the specified field values replace those in the existing value. Updates are restricted in the following ways:
Translation: Using the prior definition of pick,
e { bs } | = | case e of |
C1 v1 … vk1 -> C1 (pick1C1 bs v1) … (pickk 1C1 bs v k1) | ||
... | ||
Cj v1 … vkj -> Cj (pick1Cj bs v1) … (pickk jCj bs v kj) | ||
_ -> error "Update error" | ||
where {C1,…,Cj} is the set of constructors containing all labels in bs, and ki is the arity of Ci.
Here are some examples using labeled fields:
Expression | Translation |
C1 {f1 = 3} | C1 3 undefined |
C2 {f1 = 1, f4 = 'A', f3 = 'B'} | C2 1 'B' 'A' |
x {f1 = 1} | case x of C1 _ f2 -> C1 1 f2 |
C2 _ f3 f4 -> C2 1 f3 f4 | |
exp | → | exp :: [context =>] type |
Expression type-signatures have the form e :: t, where e is an expression and t is a type (Section 4.1.2); they are used to type an expression explicitly and may be used to resolve ambiguous typings due to overloading (see Section 4.3.4). The value of the expression is just that of exp. As with normal type signatures (see Section 4.4.1), the declared type may be more specific than the principal type derivable from exp, but it is an error to give a type that is more general than, or not comparable to, the principal type.
Patterns appear in lambda abstractions, function definitions, pattern bindings, list comprehensions, do expressions, and case expressions. However, the first five of these ultimately translate into case expressions, so defining the semantics of pattern matching for case expressions is sufficient.
pat | → | lpat qconop pat | (infix constructor) |
| | lpat | ||
lpat | → | apat | |
| | - (integer | float) | (negative literal) | |
| | gcon apat1 … apatk | (arity gcon = k, k ≥ 1) | |
apat | → | var [ @ apat] | (as pattern) |
| | gcon | (arity gcon = 0) | |
| | qcon { fpat1 , … , fpatk } | (labeled pattern, k ≥ 0) | |
| | literal | ||
| | _ | (wildcard) | |
| | ( pat ) | (parenthesized pattern) | |
| | ( pat1 , … , patk ) | (tuple pattern, k ≥ 2) | |
| | [ pat1 , … , patk ] | (list pattern, k ≥ 1) | |
| | ~ apat | (irrefutable pattern) | |
fpat | → | qvar = pat |
The arity of a constructor must match the number of sub-patterns associated with it; one cannot match against a partially-applied constructor.
All patterns must be linear —no variable may appear more than once. For example, this definition is illegal:
Patterns of the form var@pat are called as-patterns, and allow one to use var as a name for the value being matched by pat. For example,
is equivalent to:
Patterns of the form _ are wildcards and are useful when some part of a pattern is not referenced on the right-hand-side. It is as if an identifier not used elsewhere were put in its place. For example,
is equivalent to:
Patterns are matched against values. Attempting to match a pattern can have one of three results: it may fail; it may succeed, returning a binding for each variable in the pattern; or it may diverge (i.e. return ⊥). Pattern matching proceeds from left to right, and outside to inside, according to the following rules:
Operationally, this means that no matching is done on a ~apat pattern until one of the variables in apat is used. At that point the entire pattern is matched against the value, and if the match fails or diverges, so does the overall computation.
That is, constructors associated with newtype serve only to change the type of a value.
The interpretation of numeric literals is exactly as described in Section 3.2; that is, the overloaded function fromInteger or fromRational is applied to an Integer or Rational literal (resp) to convert it to the appropriate type.
Aside from the obvious static type constraints (for example, it is a static error to match a character against a boolean), the following static class constraints hold:
It is sometimes helpful to distinguish two kinds of patterns. Matching an irrefutable pattern is non-strict: the pattern matches even if the value to be matched is ⊥. Matching a refutable pattern is strict: if the value to be matched is ⊥ the match diverges. The irrefutable patterns are as follows: a variable, a wildcard, N apat where N is a constructor defined by newtype and apat is irrefutable (see Section 4.2.3), var@apat where apat is irrefutable, or of the form ~apat (whether or not apat is irrefutable). All other patterns are refutable.
Here are some examples:
(\ ~(x,y) -> 0) ⊥ ⇒ 0 |
(\ (x,y) -> 0) ⊥ ⇒ ⊥ |
(\ ~[x] -> 0) [] ⇒ 0 |
(\ ~[x] -> x) [] ⇒ ⊥ |
(\ ~[x,~(a,b)] -> x) [(0,1),⊥] ⇒ (0,1) |
(\ ~[x, (a,b)] -> x) [(0,1),⊥] ⇒ ⊥ |
(\ (x:xs) -> x:x:xs) ⊥ ⇒ ⊥ |
(\ ~(x:xs) -> x:x:xs) ⊥ ⇒ ⊥:⊥:⊥ |
These examples illustrate the difference in pattern matching between types defined by data and newtype:
(\ (N True) -> True) ⊥ ⇒ ⊥ |
(\ (D True) -> True) ⊥ ⇒ ⊥ |
(\ ~(D True) -> True) ⊥ ⇒ True |
Additional examples may be found in Section 4.2.3.
Top level patterns in case expressions and the set of top level patterns in function or pattern bindings may have zero or more associated guards. See Section 3.13 for the syntax and semantics of guards.
The guard semantics have an influence on the strictness characteristics of a function or case expression. In particular, an otherwise irrefutable pattern may be evaluated because of a guard. For example, in
both a and y will be evaluated by == in the guard.
The semantics of all pattern matching constructs other than case expressions are defined by giving identities that relate those constructs to case expressions. The semantics of case expressions themselves are in turn given as a series of identities, in Figures 3.1–3.3. Any implementation should behave so that these identities hold; it is not expected that it will use them directly, since that would generate rather inefficient code.
(a) | case e of { alts } = (\v -> case v of { alts }) e |
where v is a new variable | |
(b) | case v of { p 1 match1; … ; pn matchn } |
= case v of { p1 match1 ; | |
_ -> … case v of { | |
pn matchn ; | |
_ -> error "No match" }…} | |
where each matchi has the form: | |
| gsi,1 -> ei,1 ; … ; | gsi,mi -> ei,mi where { declsi } | |
(c) | case v of { p | gs1 -> e1 ; … |
| gsn -> en where { decls } | |
_ -> e′ } | |
= case e′ of { y -> | |
case v of { | |
p -> let { decls } in | |
case () of { | |
() | gs1 -> e1; | |
_ -> … case () of { | |
() | gsn -> en; | |
_ -> y } … } | |
_ -> y }} | |
where y is a new variable | |
(d) | case v of { ~p -> e; _ -> e′ } |
= (\x1 … xn -> e ) (case v of { p-> x1 })… (case v of { p -> xn}) | |
where x1,…,xn are all the variables in p | |
(e) | case v of { x@p -> e; _ -> e′ } |
= case v of { p -> ( \ x -> e ) v ; _ -> e′ } | |
(f) | case v of { _ -> e; _ -> e′ } = e |
(g) | case v of { K p1…pn -> e; _ -> e′ } |
= case v of { | |
K x1…xn -> case x1 of { | |
p1 -> … case xn of { pn -> e ; _ -> e′ } … | |
_ -> e′ } | |
_ -> e′ } | |
at least one of p1,…,pn is not a variable; x1,…,xn are new variables | |
(h) | case v of { k -> e; _ -> e′ } = if (v==k) then e else e′ |
where k is a numeric, character, or string literal | |
(i) | case v of { x -> e; _ -> e′ } = case v of { x -> e } |
(j) | case v of { x -> e } = ( \ x -> e ) v |
(k) | case N v of { N p -> e; _ -> e′ } |
= case v of { p -> e; _ -> e′ } | |
where N is a newtype constructor | |
(l) | case ⊥ of { N p -> e; _ -> e′ } = case ⊥ of { p -> e } |
where N is a newtype constructor | |
(m) | case v of { K { f1 = p1 , f2 = p2 , … } -> e ; _ -> e′ } |
= case e′ of { | |
y -> | |
case v of { | |
K { f1 = p1 } -> | |
case v of { K { f2 = p2 , … } -> e ; _ -> y }; | |
_ -> y }} | |
where f1, f2, … are fields of constructor K; y is a new variable | |
(n) | case v of { K { f = p } -> e ; _ -> e′ } |
= case v of { | |
K p1 … pn -> e ; _ -> e′ } | |
where pi is p if f labels the ith component of K, _ otherwise | |
(o) | case v of { K {} -> e ; _ -> e′ } |
= case v of { | |
K _… _ -> e ; _ -> e′ } | |
(p) | case (K′ e1 … em) of { K x1 … xn -> e; _ -> e′ } = e′ |
where K and K′ are distinct data constructors of arity n and m, respectively | |
(q) | case (K e1 … en) of { K x1 … xn -> e; _ -> e′ } |
= (\x1 … xn -> e) e1 … en | |
where K is a data constructor of arity n | |
(r) | case ⊥ of { K x1 … xn -> e; _ -> e′ } = ⊥ |
where K is a data constructor of arity n | |
(s) | case () of { () | g1, …, gn -> e; _ -> e′ } |
= case () of { | |
() | g1 -> … case () of { | |
() | gn -> e; | |
_ -> e′ } … | |
_ -> e′ } | |
where y is a new variable | |
(t) | case () of { () | p <- e0 -> e; _ -> e′ } |
= case e0 of { p -> e; _ -> e′ } | |
(u) | case () of { () | let decls -> e; _ -> e′ } |
= let decls in e | |
(v) | case () of { () | e0 -> e; _ -> e′ } |
= if e0 then e else e′ | |
In Figures 3.1–3.3: e, e′ and ei are expressions; gi and gsi are guards and sequences of guards respecively; p and pi are patterns; v, x, and xi are variables; K and K′ are algebraic datatype (data) constructors (including tuple constructors); and N is a newtype constructor.
Rule (b) matches a general source-language case expression, regardless of whether it actually includes guards—if no guards are written, then True is substituted for the guards gsi,j in the matchi forms. Subsequent identities manipulate the resulting case expression into simpler and simpler forms.
Rule (h) in Figure 3.2 involves the overloaded operator ==; it is this rule that defines the meaning of pattern matching against overloaded constants.
These identities all preserve the static semantics. Rules (d), (e), (j), and (q) use a lambda rather than a let; this indicates that variables bound by case are monomorphically typed (Section 4.1.4).
In this chapter, we describe the syntax and informal semantics of Haskell declarations.
module | → | module modid [exports] where body | |
| | body | ||
body | → | { impdecls ; topdecls } | |
| | { impdecls } | ||
| | { topdecls } | ||
topdecls | → | topdecl1 ; … ; topdecln | (n ≥ 1) |
topdecl | → | type simpletype = type | |
| | data [context =>] simpletype [= constrs] [deriving] | ||
| | newtype [context =>] simpletype = newconstr [deriving] | ||
| | class [scontext =>] tycls tyvar [where cdecls] | ||
| | instance [scontext =>] qtycls inst [where idecls] | ||
| | default (type1 , … , typen) | (n ≥ 0) | |
| | foreign fdecl | ||
| | decl | ||
decls | → | { decl1 ; … ; decln } | (n ≥ 0) |
decl | → | gendecl | |
| | (funlhs | pat) rhs | ||
cdecls | → | { cdecl1 ; … ; cdecln } | (n ≥ 0) |
cdecl | → | gendecl | |
| | (funlhs | var) rhs | ||
idecls | → | { idecl1 ; … ; idecln } | (n ≥ 0) |
idecl | → | (funlhs | var) rhs | |
| | (empty) | ||
gendecl | → | vars :: [context =>] type | (type signature) |
| | fixity [integer] ops | (fixity declaration) | |
| | (empty declaration) | ||
ops | → | op1 , … , opn | (n ≥ 1) |
vars | → | var1 , … , varn | (n ≥ 1) |
fixity | → | infixl | infixr | infix |
The declarations in the syntactic category topdecls are only allowed at the top level of a Haskell module (see Chapter 5), whereas decls may be used either at the top level or in nested scopes (i.e. those within a let or where construct).
For exposition, we divide the declarations into three groups: user-defined datatypes, consisting of type, newtype, and data declarations (Section 4.2); type classes and overloading, consisting of class, instance, and default declarations (Section 4.3); and nested declarations, consisting of value bindings, type signatures, and fixity declarations (Section 4.4).
Haskell has several primitive datatypes that are “hard-wired” (such as integers and floating-point numbers), but most “built-in” datatypes are defined with normal Haskell code, using normal type and data declarations. These “built-in” datatypes are described in detail in Section 6.1.
Haskell uses a traditional Hindley-Milner polymorphic type system to provide a static type semantics [4, 6], but the type system has been extended with type classes (or just classes) that provide a structured way to introduce overloaded functions.
A class declaration (Section 4.3.1) introduces a new type class and the overloaded operations that must be supported by any type that is an instance of that class. An instance declaration (Section 4.3.2) declares that a type is an instance of a class and includes the definitions of the overloaded operations—called class methods—instantiated on the named type.
For example, suppose we wish to overload the operations (+) and negate on types Int and Float. We introduce a new type class called Num:
This declaration may be read “a type a is an instance of the class Num if there are class methods (+) and negate, of the given types, defined on it.”
We may then declare Int and Float to be instances of this class:
where addInt, negateInt, addFloat, and negateFloat are assumed in this case to be primitive functions, but in general could be any user-defined function. The first declaration above may be read “Int is an instance of the class Num as witnessed by these definitions (i.e. class methods) for (+) and negate.”
More examples of type classes can be found in the papers by Jones [8] or Wadler and Blott [13]. The term ‘type class’ was used to describe the original Haskell 1.0 type system; ‘constructor class’ was used to describe an extension to the original type classes. There is no longer any reason to use two different terms: in this report, ‘type class’ includes both the original Haskell type classes and the constructor classes introduced by Jones.
To ensure that they are valid, type expressions are classified into different kinds, which take one of two possible forms:
Kind inference checks the validity of type expressions in a similar way that type inference checks the validity of value expressions. However, unlike types, kinds are entirely implicit and are not a visible part of the language. Kind inference is discussed in Section 4.6.
type | → | btype [-> type] | (function type) |
btype | → | [btype] atype | (type application) |
atype | → | gtycon | |
| | tyvar | ||
| | ( type1 , … , typek ) | (tuple type, k ≥ 2) | |
| | [ type ] | (list type) | |
| | ( type ) | (parenthesised constructor) | |
gtycon | → | qtycon | |
| | () | (unit type) | |
| | [] | (list constructor) | |
| | (->) | (function constructor) | |
| | (,{,}) | (tupling constructors) |
The syntax for Haskell type expressions is given above. Just as data values are built using data constructors, type values are built from type constructors. As with data constructors, the names of type constructors start with uppercase letters. Unlike data constructors, infix type constructors are not allowed (other than (->)).
The main forms of type expression are as follows:
Special syntax is provided for certain built-in type constructors:
Use of the (->) and [] constants is described in more detail below.
For example, the type expression IO a can be understood as the application of a constant, IO, to the variable a. Since the IO type constructor has kind ∗→∗, it follows that both the variable a and the whole expression, IO a, must have kind ∗. In general, a process of kind inference (see Section 4.6) is needed to determine appropriate kinds for user-defined datatypes, type synonyms, and classes.
Special syntax is provided to allow certain type expressions to be written in a more traditional style:
These special syntactic forms always denote the built-in type constructors for functions, tuples, and lists, regardless of what is in scope. In a similar way, the prefix type constructors (->), [], (), (,), and so on, always denote the built-in type constructors; they cannot be qualified, nor mentioned in import or export lists (Chapter 5). (Hence the special production, “gtycon”, above.)
Although the list and tuple types have special syntax, their semantics is the same as the equivalent user-defined algebraic data types.
Notice that expressions and types have a consistent syntax. If ti is the type of expression or pattern ei, then the expressions (\ e1 -> e2), [e1], and (e1,e2) have the types (t1 -> t2), [t1], and (t1,t2), respectively.
With one exception (that of the distinguished type variable in a class declaration (Section 4.3.1)), the type variables in a Haskell type expression are all assumed to be universally quantified; there is no explicit syntax for universal quantification [4]. For example, the type expression a -> a denotes the type ∀ a. a → a. For clarity, however, we often write quantification explicitly when discussing the types of Haskell programs. When we write an explicitly quantified type, the scope of the ∀ extends as far to the right as possible; for example, ∀ a. a → a means ∀ a. (a → a).
context | → | class | |
| | ( class1 , … , classn ) | (n ≥ 0) | |
class | → | qtycls tyvar | |
| | qtycls ( tyvar atype1 … atypen ) | (n ≥ 1) | |
qtycls | → | [ modid . ] tycls | |
tycls | → | conid | |
tyvar | → | varid |
A class assertion has form qtycls tyvar, and indicates the membership of the type tyvar in the class qtycls. A class identifier begins with an uppercase letter. A context consists of zero or more class assertions, and has the general form ( C1 u1, …, Cn un ) where C1, …, Cn are class identifiers, and each of the u1, …, un is either a type variable, or the application of type variable to one or more types. The outer parentheses may be omitted when n = 1. In general, we use cx to denote a context and we write cx => t to indicate the type t restricted by the context cx. The context cx must only contain type variables referenced in t. For convenience, we write cx => t even if the context cx is empty, although in this case the concrete syntax contains no =>.
In this section, we provide informal details of the type system. (Wadler and Blott [13] and Jones [8] discuss type and constructor classes, respectively, in more detail.)
The Haskell type system attributes a type to each expression in the program. In general, a type is of the form ∀ u. cx ⇒ t, where u is a set of type variables u1, …, un. In any such type, any of the universally-quantified type variables ui that are free in cx must also be free in t. Furthermore, the context cx must be of the form given above in Section 4.1.3. For example, here are some valid types:
In the third type, the constraint Eq (f a) cannot be made simpler because f is universally quantified.
The type of an expression e depends on a type environment that gives types for the free variables in e, and a class environment that declares which types are instances of which classes (a type becomes an instance of a class only via the presence of an instance declaration or a deriving clause).
Types are related by a generalization preorder (specified below); the most general type, up to the equivalence induced by the generalization preorder, that can be assigned to a particular expression (in a given environment) is called its principal type. Haskell’s extended Hindley-Milner type system can infer the principal type of all expressions, including the proper use of overloaded class methods (although certain ambiguous overloadings could arise, as described in Section 4.3.4). Therefore, explicit typings (called type signatures) are usually optional (see Sections 3.16 and 4.4.1).
The type ∀ u. cx1 ⇒ t1 is more general than the type ∀ w. cx2 ⇒ t2 if and only if there is a substitution S whose domain is u such that:
A value of type ∀ u. cx ⇒ t, may be instantiated at types s if and only if the context cx[s∕u] holds. For example, consider the function double:
The most general type of double is ∀ a. Num a ⇒ a → a. double may be applied to values of type Int (instantiating a to Int), since Num Int holds, because Int is an instance of the class Num. However, double may not normally be applied to values of type Char, because Char is not normally an instance of class Num. The user may choose to declare such an instance, in which case double may indeed be applied to a Char.
In this section, we describe algebraic datatypes (data declarations), renamed datatypes (newtype declarations), and type synonyms (type declarations). These declarations may only appear at the top level of a module.
topdecl | → | data [context =>] simpletype [= constrs] [deriving] | |
simpletype | → | tycon tyvar1 … tyvark | (k ≥ 0) |
constrs | → | constr1 | … | constrn | (n ≥ 1) |
constr | → | con [!] atype1 … [!] atypek | (arity con = k, k ≥ 0) |
| | (btype | ! atype) conop (btype | ! atype) | (infix conop) | |
| | con { fielddecl1 , … , fielddecln } | (n ≥ 0) | |
fielddecl | → | vars :: (type | ! atype) | |
deriving | → | deriving (dclass | (dclass1, … , dclassn)) | (n ≥ 0) |
dclass | → | qtycls |
The precedence for constr is the same as that for expressions—normal constructor application has higher precedence than infix constructor application (thus a : Foo a parses as a : (Foo a)).
An algebraic datatype declaration has the form:
data cx => T u1 … uk = K1 t11 … t1k1 | | Kn tn1 … tnkn
where cx is a context. This declaration introduces a new type constructor T with zero or more constituent
data constructors K1, …, Kn. In this Report, the unqualified term “constructor” always means “data
constructor”.
The types of the data constructors are given by:
Ki :: ∀ u1 … uk. cxi ⇒ ti1 → → tiki → (T u1 … uk)
where cxi is the largest subset of cx that constrains only those type variables free in the types ti1, …, tiki. The type
variables u1 through uk must be distinct and may appear in cx and the tij; it is a static error for any other
type variable to appear in cx or on the right-hand-side. The new type constant T has a kind of the form
κ1 →… → κk →∗ where the kinds κi of the argument variables ui are determined by kind inference as described
in Section 4.6. This means that T may be used in type expressions with anywhere between 0 and k
arguments.
For example, the declaration
introduces a type constructor Set of kind ∗→∗, and constructors NilSet and ConsSet with types
NilSet | :: ∀ a. Set a |
ConsSet | :: ∀ a. Eq a ⇒ a → Set a → Set a |
the function f has inferred type Eq a => Set a -> a. The context in the data declaration has no other effect whatsoever.
The visibility of a datatype’s constructors (i.e. the “abstractness” of the datatype) outside of the module in which the datatype is defined is controlled by the form of the datatype’s name in the export list as described in Section 5.8.
The optional deriving part of a data declaration has to do with derived instances, and is described in Section 4.3.3.
Labelled Fields A data constructor of arity k creates an object with k components. These components are normally accessed positionally as arguments to the constructor in expressions or patterns. For large datatypes it is useful to assign field labels to the components of a data object. This allows a specific field to be referenced independently of its location within the constructor.
A constructor definition in a data declaration may assign labels to the fields of the constructor, using the record syntax (C { ... }). Constructors using field labels may be freely mixed with constructors without them. A constructor with associated field labels may still be used as an ordinary constructor; features using labels are simply a shorthand for operations using an underlying positional constructor. The arguments to the positional constructor occur in the same order as the labeled fields. For example, the declaration
defines a type and constructor identical to the one produced by
Operations using field labels are described in Section 3.15. A data declaration may use the same field label in multiple constructors as long as the typing of the field is the same in all cases after type synonym expansion. A label cannot be shared by more than one type in scope. Field names share the top level namespace with ordinary variables and class methods and must not conflict with other top level names in scope.
The pattern F {} matches any value built with constructor F, whether or not F was declared with record syntax.
Strictness Flags Whenever a data constructor is applied, each argument to the constructor is evaluated if and only if the corresponding type in the algebraic datatype declaration has a strictness flag, denoted by an exclamation point, “!”. Lexically, “!” is an ordinary varsym not a reservedop; it has special significance only in the context of the argument types of a data declaration.
Translation: A declaration of the form data cx => T u1 … uk = … | K s1 … sn | … where each si is either of the form !ti or ti, replaces every occurrence of K in an expression by (\ x1 … xn -> ( ((K op1 x1) op2 x2) … ) opn xn) where opi is the non-strict apply function $ if si is of the form ti, and opi is the strict apply function $! (see Section 6.2) if si is of the form ! ti. Pattern matching on K is not affected by strictness flags.
topdecl | → | type simpletype = type | |
simpletype | → | tycon tyvar1 … tyvark | (k ≥ 0) |
A type synonym declaration introduces a new type that is equivalent to an old type. It has the form type T u1 … uk = t which introduces a new type constructor, T . The type (T t1 …tk) is equivalent to the type t[t1∕u1, …, tk∕uk]. The type variables u1 through uk must be distinct and are scoped only over t; it is a static error for any other type variable to appear in t. The kind of the new type constructor T is of the form κ1 →… → κk → κ where the kinds κi of the arguments ui and κ of the right hand side t are determined by kind inference as described in Section 4.6. For example, the following definition can be used to provide an alternative way of writing the list type constructor:
Type constructor symbols T introduced by type synonym declarations cannot be partially applied; it is a static error to use T without the full number of arguments.
Although recursive and mutually recursive datatypes are allowed, this is not so for type synonyms, unless an algebraic datatype intervenes. For example,
is allowed, whereas
is not. Similarly, type Rec a = [Rec a] is not allowed.
Type synonyms are a convenient, but strictly syntactic, mechanism to make type signatures more readable. A synonym and its definition are completely interchangeable, except in the instance type of an instance declaration (Section 4.3.2).
topdecl | → | newtype [context =>] simpletype = newconstr [deriving] | |
newconstr | → | con atype | |
| | con { var :: type } | ||
simpletype | → | tycon tyvar1 … tyvark | (k ≥ 0) |
A declaration of the form newtype cx => T u1 … uk = N t introduces a new type whose representation is the same as an existing type. The type (T u1… uk) renames the datatype t. It differs from a type synonym in that it creates a distinct type that must be explicitly coerced to or from the original type. Also, unlike type synonyms, newtype may be used to define recursive types. The constructor N in an expression coerces a value from type t to type (T u1 … uk). Using N in a pattern coerces a value from type (T u1 … uk) to type t. These coercions may be implemented without execution time overhead; newtype does not change the underlying representation of an object.
New instances (see Section 4.3.2) can be defined for a type defined by newtype but may not be defined for a type synonym. A type created by newtype differs from an algebraic datatype in that the representation of an algebraic datatype has an extra level of indirection. This difference may make access to the representation less efficient. The difference is reflected in different rules for pattern matching (see Section 3.17). Unlike algebraic datatypes, the newtype constructor N is unlifted, so that N ⊥ is the same as ⊥.
The following examples clarify the differences between data (algebraic datatypes), type (type synonyms), and newtype (renaming types.) Given the declarations
the expressions (d1 ⊥), (d2 ⊥) and (d2 (D2 ⊥)) are all equivalent to ⊥, whereas (n ⊥), (n (N ⊥)), (d1 (D1 ⊥)) and (s ⊥) are all equivalent to 42. In particular, (N ⊥) is equivalent to ⊥ while (D1 ⊥) is not equivalent to ⊥.
The optional deriving part of a newtype declaration is treated in the same way as the deriving component of a data declaration; see Section 4.3.3.
A newtype declaration may use field-naming syntax, though of course there may only be one field. Thus:
brings into scope both a constructor and a de-constructor:
topdecl | → | class [scontext =>] tycls tyvar [where cdecls] | |
scontext | → | simpleclass | |
| | ( simpleclass1 , … , simpleclassn ) | (n ≥ 0) | |
simpleclass | → | qtycls tyvar | |
cdecls | → | { cdecl1 ; … ; cdecln } | (n ≥ 0) |
cdecl | → | gendecl | |
| | (funlhs | var) rhs |
A class declaration introduces a new class and the operations (class methods) on it. A class declaration has the general form:
class cx => C u where cdecls |
The superclass relation must not be cyclic; i.e. it must form a directed acyclic graph.
The cdecls part of a class declaration contains three kinds of declarations:
The type of the top-level class method vi is: vi :: ∀u,w. (Cu,cxi) ⇒ ti The ti must mention u; it may mention type variables w other than u, in which case the type of vi is polymorphic in both u and w. The cxi may constrain only w; in particular, the cxi may not constrain u. For example:
Here the type of op is ∀ a, b. (Foo a, Num b) ⇒ a → b → a.
is not permitted, because the left hand side of the default declaration is a pattern.
Other than these cases, no other declarations are permitted in cdecls.
A class declaration with no where part may be useful for combining a collection of classes into a larger one that inherits all of the class methods in the original ones. For example:
In such a case, if a type is an instance of all superclasses, it is not automatically an instance of the subclass, even though the subclass has no immediate class methods. The instance declaration must be given explicitly with no where part.
topdecl | → | instance [scontext =>] qtycls inst [where idecls] | |
inst | → | gtycon | |
| | ( gtycon tyvar1 … tyvark ) | (k ≥ 0, tyvars distinct) | |
| | ( tyvar1 , … , tyvark ) | (k ≥ 2, tyvars distinct) | |
| | [ tyvar ] | ||
| | ( tyvar1 -> tyvar2 ) | (tyvar1 and tyvar2 distinct) | |
idecls | → | { idecl1 ; … ; idecln } | (n ≥ 0) |
idecl | → | (funlhs | var) rhs | |
| | (empty) |
An instance declaration introduces an instance of a class. Let class cx => C u where { cbody } be a class declaration. The general form of the corresponding instance declaration is: instance cx′ => C (T u1 … uk) where { d } where k ≥ 0. The type (T u1 … uk) must take the form of a type constructor T applied to simple type variables u1, … uk; furthermore, T must not be a type synonym, and the ui must all be distinct.
This prohibits instance declarations such as:
The declarations d may contain bindings only for the class methods of C. It is illegal to give a binding for a class method that is not in scope, but the name under which it is in scope is immaterial; in particular, it may be a qualified name. (This rule is identical to that used for subordinate names in export lists — Section 5.2.) For example, this is legal, even though range is in scope only with the qualified name Data.Ix.range.
The declarations may not contain any type signatures or fixity declarations, since these have already been given in the class declaration. As in the case of default class methods (Section 4.3.1), the method declarations must take the form of a variable or function definition.
If no binding is given for some class method then the corresponding default class method in the class declaration is used (if present); if such a default does not exist then the class method of this instance is bound to undefined and no compile-time error results.
An instance declaration that makes the type T to be an instance of class C is called a C-T instance declaration and is subject to these static restrictions:
In fact, except in pathological cases it is possible to infer from the instance declaration the most general instance context cx′ satisfying the above two constraints, but it is nevertheless mandatory to write an explicit instance context.
The following example illustrates the restrictions imposed by superclass instances:
This example is valid Haskell. Since Foo is a superclass of Bar, the second instance declaration is only valid if [a] is an instance of Foo under the assumption Num a. The first instance declaration does indeed say that [a] is an instance of Foo under this assumption, because Eq and Show are superclasses of Num.
If the two instance declarations instead read like this:
then the program would be invalid. The second instance declaration is valid only if [a] is an instance of Foo under the assumptions (Eq a, Show a). But this does not hold, since [a] is only an instance of Foo under the stronger assumption Num a.
Further examples of instance declarations may be found in Chapter 9.
As mentioned in Section 4.2.1, data and newtype declarations contain an optional deriving form. If the form is included, then derived instance declarations are automatically generated for the datatype in each of the named classes. These instances are subject to the same restrictions as user-defined instances. When deriving a class C for a type T , instances for all superclasses of C must exist for T , either via an explicit instance declaration or by including the superclass in the deriving clause.
Derived instances provide convenient commonly-used operations for user-defined datatypes. For example, derived instances for datatypes in the class Eq define the operations == and /=, freeing the programmer from the need to define them.
The only classes in the Prelude for which derived instances are allowed are Eq, Ord, Enum, Bounded, Show, and Read, all mentioned in Figure 6.1. The precise details of how the derived instances are generated for each of these classes are provided in Chapter 11, including a specification of when such derived instances are possible. Classes defined by the standard libraries may also be derivable.
A static error results if it is not possible to derive an instance declaration over a class named in a deriving form. For example, not all datatypes can properly support class methods in Enum. It is also a static error to give an explicit instance declaration for a class that is also derived.
If the deriving form is omitted from a data or newtype declaration, then no instance declarations are derived for that datatype; that is, omitting a deriving form is equivalent to including an empty deriving form: deriving ().
topdecl | → | default (type1 , … , typen) | (n ≥ 0) |
A problem inherent with Haskell-style overloading is the possibility of an ambiguous type. For example, using the read and show functions defined in Chapter 11, and supposing that just Int and Bool are members of Read and Show, then the expression
is ambiguous, because the types for show and read,
show | :: ∀ a. Show a ⇒ a → String |
read | :: ∀ a. Read a ⇒ String → a |
We say that an expression e has an ambiguous type if, in its type ∀ u. cx ⇒ t, there is a type variable u in u that occurs in cx but not in t. Such types are invalid.
For example, the earlier expression involving show and read has an ambiguous type since its type is ∀ a. Show a, Read a ⇒ String.
Ambiguous types can only be circumvented by input from the user. One way is through the use of expression type-signatures as described in Section 3.16. For example, for the ambiguous expression given earlier, one could write:
which disambiguates the type.
Occasionally, an otherwise ambiguous expression needs to be made the same type as some variable, rather than being given a fixed type with an expression type-signature. This is the purpose of the function asTypeOf (Chapter 9): x ‘asTypeOf‘ y has the value of x, but x and y are forced to have the same type. For example,
(See Section 6.4.6 for a description of encodeFloat and exponent.)
Ambiguities in the class Num are most common, so Haskell provides another way to resolve them—with a default declaration: default (t1 , … , tn) where n ≥ 0, and each ti must be a type for which Num ti holds. In situations where an ambiguous type is discovered, an ambiguous type variable, v, is defaultable if:
Each defaultable variable is replaced by the first type in the default list that is an instance of all the ambiguous variable’s classes. It is a static error if no such type is found.
Only one default declaration is permitted per module, and its effect is limited to that module. If no default declaration is given in a module then it assumed to be:
The empty default declaration, default (), turns off all defaults in a module.
The following declarations may be used in any declaration list, including the top level of a module.
gendecl | → | vars :: [context =>] type | |
vars | → | var1 , …, varn | (n ≥ 1) |
A type signature specifies types for variables, possibly with respect to a context. A type signature has the form: v1, …, vn :: cx => t which is equivalent to asserting vi :: cx => t for each i from 1 to n. Each vi must have a value binding in the same declaration list that contains the type signature; i.e. it is invalid to give a type signature for a variable bound in an outer scope. Moreover, it is invalid to give more than one type signature for one variable, even if the signatures are identical.
As mentioned in Section 4.1.2, every type variable appearing in a signature is universally quantified over that signature, and hence the scope of a type variable is limited to the type signature that contains it. For example, in the following declarations
the a’s in the two type signatures are quite distinct. Indeed, these declarations contain a static error, since x does not have type ∀ a. a. (The type of x is dependent on the type of f; there is currently no way in Haskell to specify a signature for a variable with a dependent type; this is explained in Section 4.5.4.)
If a given program includes a signature for a variable f, then each use of f is treated as having the declared type. It is a static error if the same type cannot also be inferred for the defining occurrence of f.
If a variable f is defined without providing a corresponding type signature declaration, then each use of f outside its own declaration group (see Section 4.5) is treated as having the corresponding inferred, or principal type . However, to ensure that type inference is still possible, the defining occurrence, and all uses of f within its declaration group must have the same monomorphic type (from which the principal type is obtained by generalization, as described in Section 4.5.2).
For example, if we define
then the principal type is sqr :: ∀ a. Num a ⇒ a → a, which allows applications such as sqr 5 or sqr 0.1. It is also valid to declare a more specific type, such as
but now applications such as sqr 0.1 are invalid. Type signatures such as
are invalid, as they are more general than the principal type of sqr.
Type signatures can also be used to support polymorphic recursion. The following definition is pathological, but illustrates how a type signature can be used to specify a type more general than the one that would be inferred:
If we remove the signature declaration, the type of f will be inferred as T Int -> Int due to the first recursive call for which the argument to f is T Int. Polymorphic recursion allows the user to supply the more general type signature, T a -> a.
gendecl | → | fixity [integer] ops | |
fixity | → | infixl | infixr | infix | |
ops | → | op1 , … , opn | (n ≥ 1) |
op | → | varop | conop |
A fixity declaration gives the fixity and binding precedence of one or more operators. The integer in a fixity declaration must be in the range 0 to 9. A fixity declaration may appear anywhere that a type signature appears and, like a type signature, declares a property of a particular operator. Also like a type signature, a fixity declaration can only occur in the same sequence of declarations as the declaration of the operator itself, and at most one fixity declaration may be given for any operator. (Class methods are a minor exception; their fixity declarations can occur either in the class declaration itself or at top level.)
There are three kinds of fixity, non-, left- and right-associativity (infix, infixl, and infixr, respectively), and ten precedence levels, 0 to 9 inclusive (level 0 binds least tightly, and level 9 binds most tightly). If the digit is omitted, level 9 is assumed. Any operator lacking a fixity declaration is assumed to be infixl 9 (See Section 3 for more on the use of fixities). Table 4.1 lists the fixities and precedences of the operators defined in the Prelude.
Prec- | Left associative | Non-associative | Right associative |
edence | operators | operators | operators |
9 | !! | . | |
8 | ^ , ^^ , ⋆⋆ | ||
7 | ⋆, /, ‘div‘, | ||
‘mod‘, ‘rem‘, ‘quot‘ | |||
6 | +, - | ||
5 | :, ++ | ||
4 | ==, /=, <, <=, >, >=, | ||
‘elem‘, ‘notElem‘ | |||
3 | && | ||
2 | || | ||
1 | >>, >>= | ||
0 | $, $!, ‘seq‘ | ||
Fixity is a property of a particular entity (constructor or variable), just like its type; fixity is not a property of that entity’s name. For example:
Here, ‘Bar.op‘ is infixr 7, ‘Foo.op‘ is infix 3, and the nested definition of op in f’s right-hand side has the default fixity of infixl 9. (It would also be possible to give a fixity to the nested definition of ‘op‘ with a nested fixity declaration.)
We distinguish two cases within this syntax: a pattern binding occurs when the left hand side is a pat; otherwise, the binding is called a function binding. Either binding may appear at the top-level of a module or within a where or let construct.
A function binding binds a variable to a function value. The general form of a function binding for variable x is:
x | p11 … p1k | match1 |
… |
||
x | pn1 … pnk | matchn |
= ei where { declsi } |
| gsi1 | = ei1 |
|
… |
||
| gsimi | = eimi |
|
where { declsi }
|
| True = ei where { declsi } |
Note that all clauses defining a function must be contiguous, and the number of patterns in each clause must be the same. The set of patterns corresponding to each match must be linear—no variable is allowed to appear more than once in the entire set.
Alternative syntax is provided for binding functional values to infix operators. For example, these three function definitions are all equivalent:
Note that fixity resolution applies to the infix variants of the function binding in the same way as for expressions (Section 10.6). Applying fixity resolution to the left side of the equals in a function binding must leave the varop being defined at the top level. For example, if we are defining a new operator ## with precedence 6, then this definition would be illegal:
because : has precedence 5, so the left hand side resolves to (a ## x) : xs, and this cannot be a pattern binding because (a ## x) is not a valid pattern.
A pattern binding binds variables to values. A simple pattern binding has form p = e. The pattern p is matched “lazily” as an irrefutable pattern, as if there were an implicit ~ in front of it. See the translation in Section 3.12.
The general form of a pattern binding is p match, where a match is the same structure as for function bindings above; in other words, a pattern binding is:
p | | gs1 | = e1 |
| gs2 | = e2 |
|
… |
||
| gsm | = em |
|
where { decls }
|
The static semantics of the function and pattern bindings of a let expression or where clause are discussed in this section.
In general the static semantics are given by applying the normal Hindley-Milner inference rules. In order to increase polymorphism, these rules are applied to groups of bindings identified by a dependency analysis.
A binding b1 depends on a binding b2 in the same list of declarations if either
A declaration group is a minimal set of mutually dependent bindings. Hindley-Milner type inference is applied to each declaration group in dependency order. The order of declarations in where/let constructs is irrelevant.
The Hindley-Milner type system assigns types to a let-expression in two stages:
For example, consider the declaration
The type of g’s definition is a → (a,a). The generalization step attributes to g the polymorphic type ∀ a. a → (a,a), after which the typing of the “...” part can proceed.
When typing overloaded definitions, all the overloading constraints from a single declaration group are collected together, to form the context for the type of each variable declared in the group. For example, in the definition:
The types of the definitions of g1 and g2 are both a → a → String, and the accumulated constraints are Ord a (arising from the use of >), and Show a (arising from the use of show). The type variables appearing in this collection of constraints are called the constrained type variables.
The generalization step attributes to both g1 and g2 the type ∀ a. (Ord a, Show a) ⇒ a → a → String Notice that g2 is overloaded in the same way as g1 even though the occurrences of > and show are in the definition of g1.
If the programmer supplies explicit type signatures for more than one variable in a declaration group, the contexts of these signatures must be identical up to renaming of the type variables.
As mentioned in Section 4.1.4, the context of a type may constrain only a type variable, or the application of a type variable to one or more types. Hence, types produced by generalization must be expressed in a form in which all context constraints have be reduced to this “head normal form”. Consider, for example, the definition:
Its type is given by
and not
Even though the equality is taken at the list type, the context must be simplified, using the instance declaration for Eq on lists, before generalization. If no such instance is in scope, a static error occurs.
Here is an example that shows the need for a constraint of the form C (m t) where m is one of the type variables being generalized; that is, where the class C applies to a type expression that is not a type variable or a type constructor. Consider:
The type of return is Monad m => a -> m a; the type of (==) is Eq a => a -> a -> Bool. The type of f should be therefore (Monad m, Eq (m a)) => a -> m a -> Bool, and the context cannot be simplified further.
The instance declaration derived from a data type deriving clause (see Section 4.3.3) must, like any instance declaration, have a simple context; that is, all the constraints must be of the form C a, where a is a type variable. For example, in the type
the derived Show instance will produce a context Show (a b), which cannot be reduced and is not simple; thus a static error results.
Sometimes it is not possible to generalize over all the type variables used in the type of the definition. For example, consider the declaration
In an environment where x has type a, the type of g’s definition is a → b → ([a],b). The generalization step attributes to g the type ∀ b. a → b → ([a],b); only b can be universally quantified because a occurs in the type environment. We say that the type of g is monomorphic in the type variable a.
The effect of such monomorphism is that the first argument of all applications of g must be of a single type. For example, it would be valid for the “...” to be
(which would, incidentally, force x to have type Bool) but invalid for it to be
In general, a type ∀ u. cx ⇒ t is said to be monomorphic in the type variable a if a is free in ∀ u. cx ⇒ t.
It is worth noting that the explicit type signatures provided by Haskell are not powerful enough to express types that include monomorphic type variables. For example, we cannot write
because that would claim that g was polymorphic in both a and b (Section 4.4.1). In this program, g can only be given a type signature if its first argument is restricted to a type not involving type variables; for example
This signature would also cause x to have type Int.
Haskell places certain extra restrictions on the generalization step, beyond the standard Hindley-Milner restriction described above, which further reduces polymorphism in particular cases.
The monomorphism restriction depends on the binding syntax of a variable. Recall that a variable is bound by either a function binding or a pattern binding, and that a simple pattern binding is a pattern binding in which the pattern consists of only a single variable (Section 4.4.3).
The following two rules define the monomorphism restriction:
The usual Hindley-Milner restriction on polymorphism is that only type variables that do not occur free in the environment may be generalized. In addition, the constrained type variables of a restricted declaration group may not be generalized in the generalization step for that group. (Recall that a type variable is constrained if it must belong to some type class; see Section 4.5.2.)
Motivation Rule 1 is required for two reasons, both of which are fairly subtle.
Now consider the following expression:
It looks as if len should be computed only once, but without Rule 1 it might be computed twice, once at each of two different overloadings. If the programmer does actually wish the computation to be repeated, an explicit type signature may be added:
Recall that reads is a standard function whose type is given by the signature
Without Rule 1, n would be assigned the type ∀ a. Read a ⇒ a and s the type ∀ a. Read a ⇒ String. The latter is an invalid type, because it is inherently ambiguous. It is not possible to determine at what overloading to use s, nor can this be solved by adding a type signature for s. Hence, when non-simple pattern bindings are used (Section 4.4.3.2), the types inferred are always monomorphic in their constrained type variables, irrespective of whether a type signature is provided. In this case, both n and s are monomorphic in a.
The same constraint applies to pattern-bound functions. For example, in
both f and g are monomorphic regardless of any type signatures supplied for f or g.
Rule 2 is required because there is no way to enforce monomorphic use of an exported binding, except by performing type inference on modules outside the current module. Rule 2 states that the exact types of all the variables bound in a module must be determined by that module alone, and not by any modules that import it.
When type inference on module M1 is complete, len1 has the monomorphic type Num a => a (by Rule 1). Rule 2 now states that the monomorphic type variable a is ambiguous, and must be resolved using the defaulting rules of Section 4.3.4. Hence, len1 gets type Int, and its use in len2 is type-incorrect. (If the above code is actually what is wanted, a type signature on len1 would solve the problem.)
This issue does not arise for nested bindings, because their entire scope is visible to the compiler.
Consequences The monomorphism rule has a number of consequences for the programmer. Anything defined with function syntax usually generalizes as a function is expected to. Thus in
the function f may be used at any overloading in class Num. There is no danger of recomputation here. However, the same function defined with pattern syntax:
requires a type signature if f is to be fully overloaded. Many functions are most naturally defined using simple pattern bindings; the user must be careful to affix these with type signatures to retain full overloading. The standard prelude contains many examples of this:
Rule 1 applies to both top-level and nested definitions. Consider
Here, type inference finds that len1 has the monomorphic type (Num a => a); and the type variable a is resolved to Rational when performing type inference on len2.
This section describes the rules that are used to perform kind inference, i.e. to calculate a suitable kind for each type constructor and class appearing in a given program.
The first step in the kind inference process is to arrange the set of datatype, synonym, and class definitions into dependency groups. This can be achieved in much the same way as the dependency analysis for value declarations that was described in Section 4.5. For example, the following program fragment includes the definition of a datatype constructor D, a synonym S and a class C, all of which would be included in the same dependency group:
The kinds of variables, constructors, and classes within each group are determined using standard techniques of type inference and kind-preserving unification [8]. For example, in the definitions above, the parameter a appears as an argument of the function constructor (->) in the type of bar and hence must have kind ∗. It follows that both D and S must have kind ∗→∗ and that every instance of class C must have kind ∗.
It is possible that some parts of an inferred kind may not be fully determined by the corresponding definitions; in such cases, a default of ∗ is assumed. For example, we could assume an arbitrary kind κ for the a parameter in each of the following examples:
This would give kinds (κ →∗) → κ →∗ and κ →∗ for App and Tree, respectively, for any kind κ, and would require an extension to allow polymorphic kinds. Instead, using the default binding κ = ∗, the actual kinds for these two constructors are (∗→∗) →∗→∗ and ∗→∗, respectively.
Defaults are applied to each dependency group without consideration of the ways in which particular type constructor constants or classes are used in later dependency groups or elsewhere in the program. For example, adding the following definition to those above does not influence the kind inferred for Tree (by changing it to (∗→∗) →∗, for instance), and instead generates a static error because the kind of [], ∗→∗, does not match the kind ∗ that is expected for an argument of Tree:
This is important because it ensures that each constructor and class are used consistently with the same kind whenever they are in scope.
A module defines a collection of values, datatypes, type synonyms, classes, etc. (see Chapter 4), in an environment created by a set of imports (resources brought into scope from other modules). It exports some of these resources, making them available to other modules. We use the term entity to refer to a value, type, or class defined in, imported into, or perhaps exported from a module.
A Haskell program is a collection of modules, one of which, by convention, must be called Main and must export the value main. The value of the program is the value of the identifier main in module Main, which must be a computation of type IO τ for some type τ (see Chapter 7). When the program is executed, the computation main is performed, and its result (of type τ) is discarded.
Modules may reference other modules via explicit import declarations, each giving the name of a module to be imported and specifying its entities to be imported. Modules may be mutually recursive.
Modules are used for name-space control, and are not first class values. A multi-module Haskell program can be converted into a single-module program by giving each entity a unique name, changing all occurrences to refer to the appropriate unique name, and then concatenating all the module bodies1 . For example, here is a three-module program:
It is equivalent to the following single-module program:
Because they are allowed to be mutually recursive, modules allow a program to be partitioned freely without regard to dependencies.
A module name (lexeme modid) is a sequence of one or more identifiers beginning with capital letters, separated by dots, with no intervening spaces. For example, Data.Bool, Main and Foreign.Marshal.Alloc are all valid module names.
modid | → | {conid .} conid | (modules) |
Module names can be thought of as being arranged in a hierarchy in which appending a new component creates a child of the original module name. For example, the module Control.Monad.ST is a child of the Control.Monad sub-hierarchy. This is purely a convention, however, and not part of the language definition; in this report a modid is treated as a single identifier occupying a flat namespace.
There is one distinguished module, Prelude, which is imported into all modules by default (see Section 5.6), plus a set of standard library modules that may be imported as required (see Part II).
A module defines a mutually recursive scope containing declarations for value bindings, data types, type synonyms, classes, etc. (see Chapter 4).
module | → | module modid [exports] where body | |
| | body | ||
body | → | { impdecls ; topdecls } | |
| | { impdecls } | ||
| | { topdecls } | ||
impdecls | → | impdecl1 ; … ; impdecln | (n ≥ 1) |
topdecls | → | topdecl1 ; … ; topdecln | (n ≥ 1) |
A module begins with a header: the keyword module, the module name, and a list of entities (enclosed in round parentheses) to be exported. The header is followed by a possibly-empty list of import declarations (impdecls, Section 5.3) that specify modules to be imported, optionally restricting the imported bindings. This is followed by a possibly-empty list of top-level declarations (topdecls, Chapter 4).
An abbreviated form of module, consisting only of the module body, is permitted. If this is used, the header is assumed to be ‘module Main(main) where’. If the first lexeme in the abbreviated module is not a {, then the layout rule applies for the top level of the module.
exports | → | ( export1 , … , exportn [ , ] ) | (n ≥ 0) |
export | → | qvar | |
| | qtycon [(..) | ( cname1 , … , cnamen )] | (n ≥ 0) | |
| | qtycls [(..) | ( var1 , … , varn )] | (n ≥ 0) | |
| | module modid | ||
cname | → | var | con |
An export list identifies the entities to be exported by a module declaration. A module implementation may only export an entity that it declares, or that it imports from some other module. If the export list is omitted, all values, types and classes defined in the module are exported, but not those that are imported.
Entities in an export list may be named as follows:
In all cases, the (possibly-qualified) type constructor T must be in scope. The constructor and field names ci in the second form are unqualified; one of these subordinate names is legal if and only if (a) it names a constructor or field of T , and (b) the constructor or field is in scope in the module body regardless of whether it is in scope under a qualified or unqualified name. For example, the following is legal
Data constructors cannot be named in export lists except as subordinate names, because they cannot otherwise be distinguished from type constructors.
In all cases, C must be in scope. In the second form, one of the (unqualified) subordinate names fi is legal if and only if (a) it names a class method of C, and (b) the class method is in scope in the module body regardless of whether it is in scope under a qualified or unqualified name.
Here the module Queue uses the module name Stack in its export list to abbreviate all the entities imported from Stack.
A module can name its own local definitions in its export list using its own name in the “module M” syntax, because a local declaration brings into scope both a qualified and unqualified name (Section 5.5.1). For example:
Here module Mod1 exports all local definitions as well as those imported from Mod2 but not those imported from Mod3.
It is an error to use module M in an export list unless M is the module bearing the export list, or M is imported by at least one import declaration (qualified or unqualified).
Exports lists are cumulative: the set of entities exported by an export list is the union of the entities exported by the individual items of the list.
It makes no difference to an importing module how an entity was exported. For example, a field name f from data type T may be exported individually (f, item (1) above); or as an explicitly-named member of its data type (T(f), item (2)); or as an implicitly-named member (T(..), item(2)); or by exporting an entire module (module M, item (5)).
The unqualified names of the entities exported by a module must all be distinct (within their respective namespace). For example
There are no name clashes within module A itself, but there are name clashes in the export list between C.g and g (assuming C.g and g are different entities – remember, modules can import each other recursively), and between module B and C.f (assuming B.f and C.f are different entities).
impdecl | → | import [qualified] modid [as modid] [impspec] | |
| | (empty declaration) | ||
impspec | → | ( import1 , … , importn [ , ] ) | (n ≥ 0) |
| | hiding ( import1 , … , importn [ , ] ) | (n ≥ 0) | |
import | → | var | |
| | tycon [ (..) | ( cname1 , … , cnamen )] | (n ≥ 0) | |
| | tycls [(..) | ( var1 , … , varn )] | (n ≥ 0) | |
cname | → | var | con |
The entities exported by a module may be brought into scope in another module with an import declaration at the beginning of the module. The import declaration names the module to be imported and optionally specifies the entities to be imported. A single module may be imported by more than one import declaration. Imported names serve as top level declarations: they scope over the entire body of the module but may be shadowed by local non-top-level bindings.
The effect of multiple import declarations is strictly cumulative: an entity is in scope if it is imported by any of the import declarations in a module. The ordering of import declarations is irrelevant.
Lexically, the terminal symbols “as”, “qualified” and “hiding” are each a varid rather than a reservedid. They have special significance only in the context of an import declaration; they may also be used as variables.
Exactly which entities are to be imported can be specified in one of the following three ways:
The list must name only entities exported by the imported module. The list may be empty, in which case nothing except the instances is imported.
any constructor, class, or type named C is excluded. In contrast, using C in an import list names only a class or type.
It is an error to hide an entity that is not, in fact, exported by the imported module.
For each entity imported under the rules of Section 5.3.1, the top-level environment is extended. If the import declaration used the qualified keyword, only the qualified name of the entity is brought into scope. If the qualified keyword is omitted, then both the qualified and unqualified name of the entity is brought into scope. Section 5.5.1 describes qualified names in more detail.
The qualifier on the imported name is either the name of the imported module, or the local alias given in the as clause (Section 5.3.3) on the import statement. Hence, the qualifier is not necessarily the name of the module in which the entity was originally declared.
The ability to exclude the unqualified names allows full programmer control of the unqualified namespace: a locally defined entity can share the same name as a qualified import:
Imported modules may be assigned a local alias in the importing module using the as clause. For example, in
entities must be referenced using ‘C.’ as a qualifier instead of ‘VeryLongModuleName.’. This also allows a different module to be substituted for VeryLongModuleName without changing the qualifiers used for the imported module. It is legal for more than one module in scope to use the same qualifier, provided that all names can still be resolved unambiguously. For example:
This module is legal provided only that Foo and Baz do not both export f.
An as clause may also be used on an un-qualifiedimport statement:
This declaration brings into scope f and A.f.
To clarify the above import rules, suppose the module A exports x and y. Then this table shows what names are brought into scope by the specified import statement:
Import declaration | Names brought into scope |
import A | x, y, A.x, A.y |
import A() | (nothing) |
import A(x) | x, A.x |
import qualified A | A.x, A.y |
import qualified A() | (nothing) |
import qualified A(x) | A.x |
import A hiding () | x, y, A.x, A.y |
import A hiding (x) | y, A.y |
import qualified A hiding () | A.x, A.y |
import qualified A hiding (x) | A.y |
import A as B | x, y, B.x, B.y |
import A as B(x) | x, B.x |
import qualified A as B | B.x, B.y |
In all cases, all instance declarations in scope in module A are imported (Section 5.4).
Instance declarations cannot be explicitly named on import or export lists. All instances in scope within a module are always exported and any import brings all instances in from the imported module. Thus, an instance declaration is in scope if and only if a chain of import declarations leads to the module containing the instance declaration.
For example, import M() does not bring any new names in scope from module M, but does bring in any instances visible in M. A module whose only purpose is to provide instance declarations can have an empty export list. For example
A qualified name is written as modid.name (Section 2.4). A qualified name is brought into scope:
is legal. The defining occurrence must mention the unqualified name; therefore, it is illegal to write
If a module contains a bound occurrence of a name, such as f or A.f, it must be possible unambiguously to resolve which entity is thereby referred to; that is, there must be only one binding for f or A.f respectively.
It is not an error for there to exist names that cannot be so resolved, provided that the program does not mention those names. For example:
Consider the definition of tup.
The name occurring in a type signature or fixity declarations is always unqualified, and unambiguously refers to another declaration in the same declaration list (except that the fixity declaration for a class method can occur at top level — Section 4.4.2). For example, the following module is legal:
The local declaration for sin is legal, even though the Prelude function sin is implicitly in scope. The references to Prelude.sin and F.sin must both be qualified to make it unambiguous which sin is meant. However, the unqualified name sin in the type signature in the first line of F unambiguously refers to the local declaration for sin.
Every module in a Haskell program must be closed. That is, every name explicitly mentioned by the source code must be either defined locally or imported from another module. However, entities that the compiler requires for type checking or other compile time analysis need not be imported if they are not mentioned by name. The Haskell compilation system is responsible for finding any information needed for compilation without the help of the programmer. That is, the import of a variable x does not require that the datatypes and classes in the signature of x be brought into the module along with x unless these entities are referenced by name in the user program. The Haskell system silently imports any information that must accompany an entity for type checking or any other purposes. Such entities need not even be explicitly exported: the following program is valid even though T does not escape M1:
In this example, there is no way to supply an explicit type signature for y since T is not in scope. Whether or not T is explicitly exported, module M2 knows enough about T to correctly type check the program.
The type of an exported entity is unaffected by non-exported type synonyms. For example, in
the type of x is both T and Int; these are interchangeable even when T is not in scope. That is, the definition of T is available to any module that encounters it whether or not the name T is in scope. The only reason to export T is to allow other modules to refer it by name; the type checker finds the definition of T if needed whether or not it is exported.
Many of the features of Haskell are defined in Haskell itself as a library of standard datatypes, classes, and functions, called the “Standard Prelude.” In Haskell, the Prelude is contained in the module Prelude. There are also many predefined library modules, which provide less frequently used functions and types. For example, complex numbers, arrays, and most of the input/output are all part of the standard libraries. These are defined in Part II. Separating libraries from the Prelude has the advantage of reducing the size and complexity of the Prelude, allowing it to be more easily assimilated, and increasing the space of useful names available to the programmer.
Prelude and library modules differ from other modules in that their semantics (but not their implementation) are a fixed part of the Haskell language definition. This means, for example, that a compiler may optimize calls to functions in the Prelude without consulting the source code of the Prelude.
The Prelude module is imported automatically into all modules as if by the statement ‘import Prelude’, if and only if it is not imported with an explicit import declaration. This provision for explicit import allows entities defined in the Prelude to be selectively imported, just like those from any other module.
The semantics of the entities in Prelude is specified by a reference implementation of Prelude written in Haskell, given in Chapter 9. Some datatypes (such as Int) and functions (such as Int addition) cannot be specified directly in Haskell. Since the treatment of such entities depends on the implementation, they are not formally defined in Chapter 9. The implementation of Prelude is also incomplete in its treatment of tuples: there should be an infinite family of tuples and their instance declarations, but the implementation only gives a scheme.
Chapter 9 defines the module Prelude using several other modules: PreludeList, PreludeIO, and so on. These modules are not part of Haskell, and they cannot be imported separately. They are simply there to help explain the structure of the Prelude module; they should be considered part of its implementation, not part of the language definition.
The rules about the Prelude have been cast so that it is possible to use Prelude names for nonstandard purposes; however, every module that does so must have an import declaration that makes this nonstandard usage explicit. For example:
Module A redefines null, and contains an unqualified reference to null on the right hand side of nonNull. The latter would be ambiguous without the hiding(null) on the import Prelude statement. Every module that imports A unqualified, and then makes an unqualified reference to null must also resolve the ambiguous use of null just as A does. Thus there is little danger of accidentally shadowing Prelude names.
It is possible to construct and use a different module to serve in place of the Prelude. Other than the fact that it is implicitly imported, the Prelude is an ordinary Haskell module; it is special only in that some objects in the Prelude are referenced by special syntactic constructs. Redefining names used by the Prelude does not affect the meaning of these special constructs. For example, in
the explicit import Prelude() declaration prevents the automatic import of Prelude, while the declaration import MyPrelude brings the non-standard prelude into scope. The special syntax for tuples (such as (x,x) and (,)) and lists (such as [x] and []) continues to refer to the tuples and lists defined by the standard Prelude; there is no way to redefine the meaning of [x], for example, in terms of a different implementation of lists. On the other hand, the use of ++ is not special syntax, so it refers to ++ imported from MyPrelude.
It is not possible, however, to hide instance declarations in the Prelude. For example, one cannot define a new instance for Show Char.
Depending on the Haskell implementation used, separate compilation of mutually recursive modules may require that imported modules contain additional information so that they may be referenced before they are compiled. Explicit type signatures for all exported values may be necessary to deal with mutual recursion. The precise details of separate compilation are not defined by this report.
The ability to export a datatype without its constructors allows the construction of abstract datatypes (ADTs). For example, an ADT for stacks could be defined as:
Modules importing Stack cannot construct values of type StkType because they do not have access to the constructors of the type. Instead, they must use push, pop, and empty to construct such values.
It is also possible to build an ADT on top of an existing type by using a newtype declaration. For example, stacks can be defined with lists:
These types are defined by the Haskell Prelude. Numeric types are described in Section 6.4. When appropriate, the Haskell definition of the type is given. Some definitions may not be completely valid on syntactic grounds but they faithfully convey the meaning of the underlying type.
The boolean type Bool is an enumeration. The basic boolean functions are && (and), || (or), and not. The name otherwise is defined as True to make guarded expressions more readable.
The character type Char is an enumeration whose values represent Unicode characters [2]. The lexical syntax for characters is defined in Section 2.6; character literals are nullary constructors in the datatype Char. Type Char is an instance of the classes Read, Show, Eq, Ord, Enum, and Bounded. The toEnum and fromEnum functions, standard functions from class Enum, map characters to and from the Int type.
Note that ASCII control characters each have several representations in character literals: numeric escapes, ASCII mnemonic escapes, and the \^X notation. In addition, there are the following equivalences: \a and \BEL, \b and \BS, \f and \FF, \r and \CR, \t and \HT, \v and \VT, and \n and \LF.
A string is a list of characters:
Strings may be abbreviated using the lexical syntax described in Section 2.6. For example, "A string" abbreviates [ 'A',' ','s','t','r', 'i','n','g']
Lists are an algebraic datatype of two constructors, although with special syntax, as described in Section 3.7. The first constructor is the null list, written ‘[]’ (“nil”), and the second is ‘:’ (“cons”). The module PreludeList (see Section 9.1) defines many standard list functions. Arithmetic sequences and list comprehensions, two convenient syntaxes for special kinds of lists, are described in Sections 3.10 and 3.11, respectively. Lists are an instance of classes Read, Show, Eq, Ord, Monad, Functor, and MonadPlus.
Tuples are algebraic datatypes with special syntax, as defined in Section 3.8. Each tuple type has a single constructor. All tuples are instances of Eq, Ord, Bounded, Read, and Show (provided, of course, that all their component types are).
There is no upper bound on the size of a tuple, but some Haskell implementations may restrict the size of tuples, and limit the instances associated with larger tuples. However, every Haskell implementation must support tuples up to size 15, together with the instances for Eq, Ord, Bounded, Read, and Show. The Prelude and libraries define tuple functions such as zip for tuples up to a size of 7.
The constructor for a tuple is written by omitting the expressions surrounding the commas; thus (x,y) and (,) x y produce the same value. The same holds for tuple type constructors; thus, (Int,Bool,Int) and (,,) Int Bool Int denote the same type.
The following functions are defined for pairs (2-tuples): fst, snd, curry, and uncurry. Similar functions are not predefined for larger tuples.
The unit datatype () has one non-⊥ member, the nullary constructor (). See also Section 3.9.
Functions are an abstract type: no constructors directly create functional values. The following simple functions are found in the Prelude: id, const, (.), flip, ($), and until.
The IO type serves as a tag for operations (actions) that interact with the outside world. The IO type is abstract: no constructors are visible to the user. IO is an instance of the Monad and Functor classes. Chapter 7 describes I/O operations.
IOError is an abstract type representing errors raised by I/O operations. It is an instance of Show and Eq. Values of this type are constructed by the various I/O functions and are not presented in any further detail in this report. The Prelude contains a few I/O functions (defined in Section 9.3), and Part II contains many more.
The Maybe type is an instance of classes Functor, Monad, and MonadPlus. The Ordering type is used by compare in the class Ord. The functions maybe and either are found in the Prelude.
Function application in Haskell is non-strict; that is, a function argument is evaluated only when required. Sometimes it is desirable to force the evaluation of a value, using the seq function:
The function seq is defined by the equations:
seq ⊥ b = ⊥ |
seq a b = b, if a ≠ ⊥ |
The operator $! is strict (call-by-value) application, and is defined in terms of seq. The Prelude also defines the $ operator to perform non-strict application.
The non-strict application operator $ may appear redundant, since ordinary application (f x) means the same as (f $ x). However, $ has low, right-associative binding precedence, so it sometimes allows parentheses to be omitted; for example:
It is also useful in higher-order situations, such as map ($ 0) xs, or zipWith ($) fs xs.
Figure 6.1 shows the hierarchy of Haskell classes defined in the Prelude and the Prelude types that are instances of these classes.
Default class method declarations (Section 4.3) are provided for many of the methods in standard classes. A comment with each class declaration in Chapter 9 specifies the smallest collection of method definitions that, together with the default declarations, provide a reasonable definition for all the class methods. If there is no such comment, then all class methods must be given to fully specify an instance.
The Eq class provides equality (==) and inequality (/=) methods. All basic datatypes except for functions and IO are instances of this class. Instances of Eq can be derived for any user-defined datatype whose constituents are also instances of Eq.
This declaration gives default method declarations for both /= and ==, each being defined in terms of the other. If an instance declaration for Eq defines neither == nor /=, then both will loop. If one is defined, the default method for the other will make use of the one that is defined. If both are defined, neither default method is used.
The Ord class is used for totally ordered datatypes. All basic datatypes except for functions, IO, and IOError, are instances of this class. Instances of Ord can be derived for any user-defined datatype whose constituent types are in Ord. The declared order of the constructors in the data declaration determines the ordering in derived Ord instances. The Ordering datatype allows a single comparison to determine the precise ordering of two objects.
The default declarations allow a user to create an Ord instance either with a type-specific compare function or with type-specific == and <= functions.
The Read and Show classes are used to convert values to or from strings. The Int argument to showsPrec and readsPrec gives the operator precedence of the enclosing context (see Section 11.4).
showsPrec and showList return a String-to-String function, to allow constant-time concatenation of its results using function composition. A specialised variant, show, is also provided, which uses precedence context zero, and returns an ordinary String. The method showList is provided to allow the programmer to give a specialised way of showing lists of values. This is particularly useful for the Char type, where values of type String should be shown in double quotes, rather than between square brackets.
Derived instances of Read and Show replicate the style in which a constructor is declared: infix constructors and field names are used on input and output. Strings produced by showsPrec are usually readable by readsPrec.
All Prelude types, except function types and IO types, are instances of Show and Read. (If desired, a programmer can easily make functions and IO types into (vacuous) instances of Show, by providing an instance declaration.)
For convenience, the Prelude provides the following auxiliary functions:
shows and reads use a default precedence of 0. The read function reads input from a string, which must be completely consumed by the input process.
The function lex :: ReadS String, used by read, is also part of the Prelude. It reads a single lexeme from the input, discarding initial white space, and returning the characters that constitute the lexeme. If the input string contains only white space, lex returns a single successful “lexeme” consisting of the empty string. (Thus lex "" = [("","")].) If there is no legal lexeme at the beginning of the input string, lex fails (i.e. returns []).
Class Enum defines operations on sequentially ordered types. The functions succ and pred return the successor and predecessor, respectively, of a value. The functions fromEnum and toEnum map values from a type in Enum to and from Int. The enumFrom... methods are used when translating arithmetic sequences (Section 3.10).
Instances of Enum may be derived for any enumeration type (types whose constructors have no fields); see Chapter 11.
For any type that is an instance of class Bounded as well as Enum, the following should hold:
The following Prelude types are instances of Enum:
For all four numeric types, succ adds 1, and pred subtracts 1. The conversions fromEnum and toEnum convert between the type and Int. In the case of Float and Double, the digits after the decimal point may be lost. It is implementation-dependent what fromEnum returns when applied to a value that is too large to fit in an Int.
For the types Int and Integer, the enumeration functions have the following meaning:
For Float and Double, the semantics of the enumFrom family is given by the rules for Int above, except that the list terminates when the elements become greater than e3 + i∕2 for positive increment i, or when they become less than e3 + i∕2 for negative i.
For all four of these Prelude numeric types, all of the enumFrom family of functions are strict in all their arguments.
The Functor class is used for types that can be mapped over. Lists, IO, and Maybe are in this class.
Instances of Functor should satisfy the following laws:
fmap id | = | id |
fmap (f . g) | = | fmap f . fmap g |
The Monad class defines the basic operations over a monad. See Chapter 7 for more information about monads.
“do” expressions provide a convenient syntax for writing monadic expressions (see Section 3.14). The fail method is invoked on pattern-match failure in a do expression.
In the Prelude, lists, Maybe, and IO are all instances of Monad. The fail method for lists returns the empty list [], for Maybe returns Nothing, and for IO raises a user exception in the IO monad (see Section 7.3).
Instances of Monad should satisfy the following laws:
return a >>= k | = | k a |
m >>= return | = | m |
m >>= (\x -> k x >>= h) | = | (m >>= k) >>= h |
fmap f xs | = | xs >>= return . f |
The Prelude provides the following auxiliary functions:
The Bounded class is used to name the upper and lower limits of a type. Ord is not a superclass of Bounded since types that are not totally ordered may also have upper and lower bounds. The types Int, Char, Bool, (), Ordering, and all tuples are instances of Bounded. The Bounded class may be derived for any enumeration type; minBound is the first constructor listed in the data declaration and maxBound is the last. Bounded may also be derived for single-constructor datatypes whose constituent types are in Bounded.
Haskell provides several kinds of numbers; the numeric types and the operations upon them have been heavily influenced by Common Lisp and Scheme. Numeric function names and operators are usually overloaded, using several type classes with an inclusion relation shown in Figure 6.1. The class Num of numeric types is a subclass of Eq, since all numbers may be compared for equality; its subclass Real is also a subclass of Ord, since the other comparison operations apply to all but complex numbers (defined in the Complex library). The class Integral contains integers of both limited and unlimited range; the class Fractional contains all non-integral types; and the class Floating contains all floating-point types, both real and complex.
The Prelude defines only the most basic numeric types: fixed sized integers (Int), arbitrary precision integers (Integer), single precision floating (Float), and double precision floating (Double). Other numeric types such as rationals and complex numbers are defined in libraries. In particular, the type Rational is a ratio of two Integer values, as defined in the Ratio library.
The default floating point operations defined by the Haskell Prelude do not conform to current language independent arithmetic (LIA) standards. These standards require considerably more complexity in the numeric structure and have thus been relegated to a library. Some, but not all, aspects of the IEEE floating point standard have been accounted for in Prelude class RealFloat.
The standard numeric types are listed in Table 6.1. The finite-precision integer type Int covers at least the range [ − 229, 229 − 1]. As Int is an instance of the Bounded class, maxBound and minBound can be used to determine the exact Int range defined by an implementation. Float is implementation-defined; it is desirable that this type be at least equal in range and precision to the IEEE single-precision type. Similarly, Double should cover IEEE double-precision. The results of exceptional conditions (such as overflow or underflow) on the fixed-precision numeric types are undefined; an implementation may choose error (⊥, semantically), a truncated value, or a special value such as infinity, indefinite, etc.
Type | Class | Description
|
Integer | Integral | Arbitrary-precision integers |
Int | Integral | Fixed-precision integers |
(Integral a) => Ratio a | RealFrac | Rational numbers |
Float | RealFloat | Real floating-point, single precision |
Double | RealFloat | Real floating-point, double precision |
(RealFloat a) => Complex a | Floating | Complex floating-point |
The standard numeric classes and other numeric functions defined in the Prelude are shown in Figures 6.2–6.3. Figure 6.1 shows the class dependencies and built-in types that are instances of the numeric classes.
The syntax of numeric literals is given in Section 2.5. An integer literal represents the application of the function fromInteger to the appropriate value of type Integer. Similarly, a floating literal stands for an application of fromRational to a value of type Rational (that is, Ratio Integer). Given the typings:
integer and floating literals have the typings (Num a) => a and (Fractional a) => a, respectively. Numeric literals are defined in this indirect way so that they may be interpreted as values of any appropriate numeric type. See Section 4.3.4 for a discussion of overloading ambiguity.
The infix class methods (+), (⋆), (-), and the unary function negate (which can also be written as a prefix minus sign; see section 3.4) apply to all numbers. The class methods quot, rem, div, and mod apply only to integral numbers, while the class method (/)applies only to fractional ones. The quot, rem, div, and mod class methods satisfy these laws if y is non-zero:
‘quot‘ is integer division truncated toward zero, while the result of ‘div‘ is truncated toward negative infinity. The quotRem class method takes a dividend and a divisor as arguments and returns a (quotient, remainder) pair; divMod is defined similarly:
Also available on integral numbers are the even and odd predicates:
Finally, there are the greatest common divisor and least common multiple functions. gcd x y is the greatest (positive) integer that divides both x and y; for example gcd (-3) 6 = 3, gcd (-3) (-6) = 3, gcd 0 4 = 4. gcd 0 0 raises a runtime error.
lcm x y is the smallest positive integer that both x and y divide.
The one-argument exponential function exp and the logarithm function log act on floating-point numbers and use base e. logBase a x returns the logarithm of x in base a. sqrt returns the principal square root of a floating-point number. There are three two-argument exponentiation operations: (^) raises any number to a nonnegative integer power, (^^) raises a fractional number to any integer power, and (⋆⋆)takes two floating-point arguments. The value of x^0 or x^^0 is 1 for any x, including zero; 0⋆⋆y is 1 if y is 0, and 0 otherwise.
A number has a magnitude and a sign. The functions abs and signum apply to any number and satisfy the law:
For real numbers, these functions are defined by:
Class Floating provides the circular and hyperbolic sine, cosine, and tangent functions and their inverses. Default implementations of tan, tanh, logBase, ⋆⋆, and sqrt are provided, but implementors are free to provide more accurate implementations.
Class RealFloat provides a version of arctangent taking two real floating-point arguments. For real floating x and y, atan2 y x computes the angle (from the positive x-axis) of the vector from the origin to the point (x,y). atan2 y x returns a value in the range [-pi, pi]. It follows the Common Lisp semantics for the origin when signed zeroes are supported. atan2 y 1, with y in a type that is RealFloat, should return the same value as atan y. A default definition of atan2 is provided, but implementors can provide a more accurate implementation.
The precise definition of the above functions is as in Common Lisp, which in turn follows Penfield’s proposal for APL [12]. See these references for discussions of branch cuts, discontinuities, and implementation.
The ceiling, floor, truncate, and round functions each take a real fractional argument and return an integral result. ceiling x returns the least integer not less than x, and floor x, the greatest integer not greater than x. truncate x yields the integer nearest x between 0 and x, inclusive. round x returns the nearest integer to x, the even integer if x is equidistant between two integers.
The function properFraction takes a real fractional number x and returns a pair (n,f) such that x = n + f, and: n is an integral number with the same sign as x; and f is a fraction f with the same type and sign as x, and with absolute value less than 1. The ceiling, floor, truncate, and round functions can be defined in terms of properFraction.
Two functions convert numbers to type Rational: toRational returns the rational equivalent of its real argument with full precision; approxRational takes two real fractional arguments x and ϵ and returns the simplest rational number within ϵ of x, where a rational p∕q in reduced form is simpler than another p′∕q′ if |p|≤|p′| and q ≤ q′. Every real interval contains a unique simplest rational; in particular, note that 0∕1 is the simplest rational of all.
The class methods of class RealFloat allow efficient, machine-independent access to the components of a floating-point number. The functions floatRadix, floatDigits, and floatRange give the parameters of a floating-point type: the radix of the representation, the number of digits of this radix in the significand, and the lowest and highest values the exponent may assume, respectively. The function decodeFloat applied to a real floating-point number returns the significand expressed as an Integer and an appropriately scaled exponent (an Int). If decodeFloat x yields (m,n), then x is equal in value to mbn, where b is the floating-point radix, and furthermore, either m and n are both zero or else bd−1 ≤|m| < bd, where d is the value of floatDigits x. encodeFloat performs the inverse of this transformation. The functions significand and exponent together provide the same information as decodeFloat, but rather than an Integer, significand x yields a value of the same type as x, scaled to lie in the open interval (−1,1). exponent 0 is zero. scaleFloat multiplies a floating-point number by an integer power of the radix.
The functions isNaN, isInfinite, isDenormalized, isNegativeZero, and isIEEE all support numbers represented using the IEEE standard. For non-IEEE floating point numbers, these may all return false.
Also available are the following coercion functions:
The I/O system in Haskell is purely functional, yet has all of the expressive power found in conventional programming languages. To achieve this, Haskell uses a monad to integrate I/O operations into a purely functional context.
The I/O monad used by Haskell mediates between the values natural to a functional language and the actions that characterize I/O operations and imperative programming in general. The order of evaluation of expressions in Haskell is constrained only by data dependencies; an implementation has a great deal of freedom in choosing this order. Actions, however, must be ordered in a well-defined manner for program execution – and I/O in particular – to be meaningful. Haskell’s I/O monad provides the user with a way to specify the sequential chaining of actions, and an implementation is obliged to preserve this order.
The term monad comes from a branch of mathematics known as category theory. From the perspective of a Haskell programmer, however, it is best to think of a monad as an abstract datatype. In the case of the I/O monad, the abstract values are the actions mentioned above. Some operations are primitive actions, corresponding to conventional I/O operations. Special operations (methods in the class Monad, see Section 6.3.6) sequentially compose actions, corresponding to sequencing operators (such as the semicolon) in imperative languages.
Although Haskell provides fairly sophisticated I/O facilities, as defined in the IO library, it is possible to write many Haskell programs using only the few simple functions that are exported from the Prelude, and which are described in this section.
All I/O functions defined here are character oriented. The treatment of the newline character will vary on different systems. For example, two characters of input, return and linefeed, may read as a single newline character. These functions cannot be used portably for binary I/O.
In the following, recall that String is a synonym for [Char] (Section 6.1.2).
Output Functions These functions write to the standard output device (this is normally the user’s terminal).
The print function outputs a value of any printable type to the standard output device. Printable types are those that are instances of class Show; print converts values to strings for output using the show operation and adds a newline.
For example, a program to print the first 20 integers and their powers of 2 could be written as:
Input Functions These functions read input from the standard input device (normally the user’s terminal).
The getChar operation raises an exception (Section 7.3) on end-of-file; a predicate isEOFError that identifies this exception is defined in the IO library. The getLine operation raises an exception under the same circumstances as hGetLine, defined the IO library.
The getContents operation returns all user input as a single string, which is read lazily as it is needed. The interact function takes a function of type String->String as its argument. The entire input from the standard input device is passed to this function as its argument, and the resulting string is output on the standard output device.
Typically, the read operation from class Read is used to convert the string to a value. The readIO function is similar to read except that it signals parse failure to the I/O monad instead of terminating the program. The readLn function combines getLine and readIO.
The following program simply removes all non-ASCII characters from its standard input and echoes the result on its standard output. (The isAscii function is defined in a library.)
Files These functions operate on files of characters. Files are named by strings using some implementation-specific method to resolve strings as file names.
The writeFile and appendFile functions write or append the string, their second argument, to the file, their first argument. The readFile function reads a file and returns the contents of the file as a string. The file is read lazily, on demand, as with getContents.
Note that writeFile and appendFile write a literal string to a file. To write a value of any printable type, as with print, use the show function to convert the value to a string first.
The type constructor IO is an instance of the Monad class. The two monadic binding functions, methods in the Monad class, are used to compose a series of I/O operations. The >> function is used where the result of the first operation is uninteresting, for example when it is (). The >>= operation passes the result of the first operation as an argument to the second operation.
For example,
is similar to the previous example using interact, but takes its input from "input-file" and writes its output to "output-file". A message is printed on the standard output before the program completes.
The do notation allows programming in a more imperative syntactic style. A slightly more elaborate version of the previous example would be:
The return function is used to define the result of an I/O operation. For example, getLine is defined in terms of getChar, using return to define the result:
The I/O monad includes a simple exception handling system. Any I/O operation may raise an exception instead of returning a result.
Exceptions in the I/O monad are represented by values of type IOError. This is an abstract type: its constructors are hidden from the user. The IO library defines functions that construct and examine IOError values. The only Prelude function that creates an IOError value is userError. User error values include a string describing the error.
Exceptions are raised and caught using the following functions:
The ioError function raises an exception; the catch function establishes a handler that receives any exception raised in the action protected by catch. An exception is caught by the most recent handler established by catch. These handlers are not selective: all exceptions are caught. Exception propagation must be explicitly provided in a handler by re-raising any unwanted exceptions. For example, in
the function f returns [] when an end-of-file exception occurs in g; otherwise, the exception is propagated to the next outer handler. The isEOFError function is part of IO library.
When an exception propagates outside the main program, the Haskell system prints the associated IOError value and exits the program.
The fail method of the IO instance of the Monad class (Section 6.3.6) raises a userError, thus:
The exceptions raised by the I/O functions in the Prelude are defined in Chapter 42.
The Foreign Function Interface (FFI) has two purposes: it enables (1) to describe in Haskell the interface to foreign language functionality and (2) to use from foreign code Haskell routines. More generally, its aim is to support the implementation of programs in a mixture of Haskell and other languages such that the source code is portable across different implementations of Haskell and non-Haskell systems as well as independent of the architecture and operating system.
The Haskell FFI currently only specifies the interaction between Haskell code and foreign code that follows the C calling convention. However, the design of the FFI is such that it enables the modular extension of the present definition to include the calling conventions of other programming languages, such as C++ and Java. A precise definition of the support for those languages is expected to be included in later versions of the language. The second major omission is the definition of the interaction with multithreading in the foreign language and, in particular, the treatment of thread-local state, and so these details are currently implementation-defined.
The core of the present specification is independent of the foreign language that is used in conjunction with Haskell. However, there are two areas where FFI specifications must become language specific: (1) the specification of external names and (2) the marshalling of the basic types of a foreign language. As an example of the former, consider that in C [9] a simple identifier is sufficient to identify an object, while Java [5], in general, requires a qualified name in conjunction with argument and result types to resolve possible overloading. Regarding the second point, consider that many languages do not specify the exact representation of some basic types. For example the type int in C may be 16, 32, or 64 bits wide. Similarly, Haskell guarantees only that Int covers at least the range [−229,229 − 1] (Section 6.4). As a consequence, to reliably represent values of C’s int in Haskell, we have to introduce a new type CInt, which is guaranteed to match the representation of int.
The specification of external names, dependent on a calling convention, is described in Section 8.5, whereas the marshalling of the basic types in dependence on a foreign language is described in Section 8.6.
For a given Haskell system, we define the Haskell context to be the execution context of the abstract machine on which the Haskell system is based. This includes the heap, stacks, and the registers of the abstract machine and their mapping onto a concrete architecture. We call any other execution context an external context. Generally, we cannot assume any compatibility between the data formats and calling conventions between the Haskell context and a given external context, except where Haskell explicitly prescribes a specific data format.
The principal goal of a foreign function interface is to provide a programmable interface between the Haskell context and external contexts. As a result Haskell threads can access data in external contexts and invoke functions that are executed in an external context as well as vice versa. In the rest of this definition, external contexts are usually identified by a calling convention.
Given that many external languages support static types, the question arises whether the consistency of Haskell types with the types of the external language can be enforced for foreign functions. Unfortunately, this is, in general, not possible without a significant investment on the part of the implementor of the Haskell system (i.e., without implementing a dedicated type checker). For example, in the case of the C calling convention, the only other approach would be to generate a C prototype from the Haskell type and leave it to the C compiler to match this prototype with the prototype that is specified in a C header file for the imported function. However, the Haskell type is lacking some information that would be required to pursue this route. In particular, the Haskell type does not contain any information as to when const modifiers have to be emitted.
As a consequence, this definition does not require the Haskell system to check consistency with foreign types. Nevertheless, Haskell systems are encouraged to provide any cross language consistency checks that can be implemented with reasonable effort.
The FFI reserves a single keyword foreign, and a set of special identifiers. The latter have a special meaning only within foreign declarations, but may be used as ordinary identifiers elsewhere.
The special identifiers ccall, cplusplus, dotnet, jvm, and stdcall are defined to denote calling conventions. However, a concrete implementation of the FFI is free to support additional, system-specific calling conventions whose name is not explicitly listed here.
To refer to objects of an external C context, we introduce the following phrases:
chname | → | {chchar} . h | (C header filename) |
cid | → | letter {letter | ascDigit} | (C identifier) |
chchar | → | letter | ascSymbol⟨&⟩ | |
letter | → | ascSmall | ascLarge | _ |
The range of lexemes that are admissible for chname is a subset of those permitted as arguments to the #include directive in C. In particular, a file name chname must end in the suffix .h. The lexemes produced by cid coincide with those allowed as C identifiers, as specified in [9].
The syntax of foreign declarations is as follows:
topdecl | → | foreign fdecl | |
fdecl | → | import callconv [safety] impent var :: ftype | (define variable) |
| | export callconv expent var :: ftype | (expose variable) | |
callconv | → | ccall | stdcall | cplusplus | (calling convention) |
| | jvm | dotnet | ||
| | system-specific calling conventions | ||
impent | → | [string] | |
expent | → | [string] | |
safety | → | unsafe | safe |
There are two flavours of foreign declarations: import and export declarations. An import declaration makes an external entity, i.e., a function or memory location defined in an external context, available in the Haskell context. Conversely, an export declaration defines a function of the Haskell context as an external entity in an external context. Consequently, the two types of declarations differ in that an import declaration defines a new variable, whereas an export declaration uses a variable that is already defined in the Haskell module.
The external context that contains the external entity is determined by the calling convention given in the foreign declaration. Consequently, the exact form of the specification of the external entity is dependent on both the calling convention and on whether it appears in an import declaration (as impent) or in an export declaration (as expent). To provide syntactic uniformity in the presence of different calling conventions, it is guaranteed that the description of an external entity lexically appears as a Haskell string lexeme. The only exception is where this string would be the empty string (i.e., be of the form ""); in this case, the string may be omitted in its entirety.
The binary interface to an external entity on a given architecture is determined by a calling convention. It often depends on the programming language in which the external entity is implemented, but usually is more dependent on the system for which the external entity has been compiled.
As an example of how the calling convention is dominated by the system rather than the programming language, consider that an entity compiled to byte code for the Java Virtual Machine (JVM) [11] needs to be invoked by the rules of the JVM rather than that of the source language in which it is implemented (the entity might be implemented in Oberon, for example).
Any implementation of the Haskell FFI must at least implement the C calling convention denoted by ccall. All other calling conventions are optional. Generally, the set of calling conventions is open, i.e., individual implementations may elect to support additional calling conventions. In addition to ccall, Table 8.1 specifies a range of identifiers for common calling conventions.
Identifier | Represented calling convention |
ccall | Calling convention of the standard C compiler on a system |
cplusplus | Calling convention of the standard C++ compiler on a system |
dotnet | Calling convention of the .net platform |
jvm | Calling convention of the Java Virtual Machine |
stdcall | Calling convention of the Win32 API (matches Pascal conventions) |
Implementations need not implement all of these conventions, but if any is implemented, it must use the listed name. For any other calling convention, implementations are free to choose a suitable name.
Only the semantics of the calling conventions ccall and stdcall are defined herein; more calling conventions may be added in future versions of Haskell.
It should be noted that the code generated by a Haskell system to implement a particular calling convention may vary widely with the target code of that system. For example, the calling convention jvm will be trivial to implement for a Haskell compiler generating Java code, whereas for a Haskell compiler generating C code, the Java Native Interface (JNI) [10] has to be targeted.
The following types constitute the set of basic foreign types:
A Haskell system that implements the FFI needs to be able to pass these types between the Haskell and the external context as function arguments and results.
Foreign types are produced according to the following grammar:
ftype | → | frtype | |
| | fatype -> ftype | ||
frtype | → | fatype | |
| | () | ||
fatype | → | qtycon atype1 … atypek | (k ≥ 0) |
A foreign type is the Haskell type of an external entity. Only a subset of Haskell’s types are permissible as foreign
types, as only a restricted set of types can be canonically transferred between the Haskell context and an external
context. A foreign type has the form
at1 -> -> atn -> rt
where n ≥ 0. It implies that the arity of the external entity is n.
External functions are strict in all arguments.
Marshallable foreign types. The argument types ati produced by fatype must be marshallable foreign types; that is, either
newtype T a1 … an = N t
and
Consequently, in order for a type defined by newtype to be used in a foreign declaration outside of the module that defines it, the type must not be exported abstractly. The module Foreign.C.Types that defines the Haskell equivalents for C types follows this convention; see Chapter 28.
Marshallable foreign result types. The result type rt produced by frtype must be a marshallable foreign result type; that is, either
newtype T a1 … an = N t
and
Generally, an import declaration has the form foreign import c e v :: t which declares the variable v of type t to be defined externally. Moreover, it specifies that v is evaluated by executing the external entity identified by the string e using calling convention c. The precise form of e depends on the calling convention and is detailed in Section 8.5. If a variable v is defined by an import declaration, no other top-level declaration for v is allowed in the same module. For example, the declaration
introduces the function cstrlen, which invokes the external function strlen using the standard C calling convention. Some external entities can be imported as pure functions; for example,
Such a declaration asserts that the external entity is a true function; i.e., when applied to the same argument values, it always produces the same result.
Whether a particular form of external entity places a constraint on the Haskell type with which it can be imported is defined in Section 8.5. Although, some forms of external entities restrict the set of Haskell types that are permissible, the system can generally not guarantee the consistency between the Haskell type given in an import declaration and the argument and result types of the external entity. It is the responsibility of the programmer to ensure this consistency.
Optionally, an import declaration can specify, after the calling convention, the safety level that should be used when invoking an external entity. A safe call is less efficient, but guarantees to leave the Haskell system in a state that allows callbacks from the external code. In contrast, an unsafe call, while carrying less overhead, must not trigger a callback into the Haskell system. If it does, the system behaviour is undefined. The default for an invocation is to be safe. Note that a callback into the Haskell system implies that a garbage collection might be triggered after an external entity was called, but before this call returns. Consequently, objects other than stable pointers (cf. Section 36) may be moved or garbage collected by the storage manager.
The general form of export declarations is foreign export c e v :: t Such a declaration enables external access to v, which may be a value, field name, or class method that is declared on the top-level of the same module or imported. Moreover, the Haskell system defines the external entity described by the string e, which may be used by external code using the calling convention c; an external invocation of the external entity e is translated into evaluation of v. The type t must be an instance of the type of v. For example, we may have
If an evaluation triggered by an external invocation of an exported Haskell value returns with an exception, the system behaviour is undefined. Thus, Haskell exceptions have to be caught within Haskell and explicitly marshalled to the foreign code.
Each foreign declaration has to specify the external entity that is accessed or provided by that declaration. The syntax and semantics of the notation that is required to uniquely determine an external entity depends heavily on the calling convention by which this entity is accessed. For example, for the calling convention ccall, a global label is sufficient. However, to uniquely identify a method in the calling convention jvm, type information has to be provided. For the latter, there is a choice between the Java source-level syntax of types and the syntax expected by JNI—but, clearly, the syntax of the specification of an external entity depends on the calling convention and may be non-trivial.
Consequently, the FFI does not fix a general syntax for denoting external entities, but requires both impent and expent to take the form of a Haskell string literal. The formation rules for the values of these strings depend on the calling convention and a Haskell system implementing a particular calling convention will have to parse these strings in accordance with the calling convention.
Defining impent and expent to take the form of a string implies that all information that is needed to statically analyse the Haskell program is separated from the information needed to generate the code interacting with the foreign language. This is, in particular, helpful for tools processing Haskell source code. When ignoring the entity information provided by impent or expent, foreign import and export declarations are still sufficient to infer identifier definition and use information as well as type information.
For more complex calling conventions, there is a choice between the user-level syntax for identifying entities (e.g., Java or C++) and the system-level syntax (e.g., the type syntax of JNI or mangled C++, respectively). If such a choice exists, the user-level syntax is preferred. Not only because it is more user friendly, but also because the system-level syntax may not be entirely independent of the particular implementation of the foreign language.
The following defines the syntax for specifying external entities and their semantics for the calling conventions ccall and stdcall. Other calling conventions from Table 8.1 are expected to be defined in future versions of Haskell.
The following defines the structure of external entities for foreign declarations under the ccall calling convention for both import and export declarations separately. Afterwards additional constraints on the type of foreign functions are defined.
The FFI covers only access to C functions and global variables. There are no mechanisms to access other entities of C programs. In particular, there is no support for accessing pre-processor symbols from Haskell, which includes #defined constants. Access from Haskell to such entities is the domain of language-specific tools, which provide added convenience over the plain FFI as defined here.
Import Declarations For import declarations, the syntax for the specification of external entities under the ccall calling convention is as follows:
impent | → | " [static] [chname] [&] [cid] " | (static function or address) |
| | " dynamic " | (stub factory importing addresses) | |
| | " wrapper " | (stub factory exporting thunks) |
The first alternative either imports a static function cid or, if & precedes the identifier, a static address. If cid is omitted, it defaults to the name of the imported Haskell variable. The optional filename chname specifies a C header file, where the intended meaning is that the header file declares the C entity identified by cid. In particular, when the Haskell system compiles Haskell to C code, the directive
#include "chname"
needs to be placed into any generated C file that refers to the foreign entity before the first occurrence of that entity in the generated C file.
The second and third alternative, identified by the keywords dynamic and wrapper, respectively, import stub functions that have to be generated by the Haskell system. In the case of dynamic, the stub converts C function pointers into Haskell functions; and conversely, in the case of wrapper, the stub converts Haskell thunks to C function pointers. If neither of the specifiers static, dynamic, or wrapper is given, static is assumed. The specifier static is nevertheless needed to import C routines that are named dynamic or wrapper.
It should be noted that a static foreign declaration that does not import an address (i.e., where & is not used in the specification of the external entity) always refers to a C function, even if the Haskell type is non-functional. For example,
refers to a pure C function foo with no arguments that returns an integer value. Similarly, if the type is IO CInt, the declaration refers to an impure nullary function. If a Haskell program needs to access a C variable bar of integer type,
must be used to obtain a pointer referring to the variable. The variable can be read and updated using the routines provided by the module Foreign.Storable (cf. Section 37).
Export Declarations External entities in ccall export declarations are of the form
expent | → | " [cid] " |
The optional C identifier cid defines the external name by which the exported Haskell variable is accessible in C. If it is omitted, the external name defaults to the name of the exported Haskell variable.
Constraints on Foreign Function Types In the case of import declaration, there are, depending on the kind of import declaration, constraints regarding the admissible Haskell type that the variable defined in the import may have. These constraints are specified in the following.
As an example, consider
This declaration imports the system() function whose prototype is available from stdlib.h.
As an example, consider
It imports the address of the variable errno, which is of the C type int.
As an example, consider
The stub factory mkFun converts any pointer to a C function that gets an integer value as its only argument and does not have a return value into a corresponding Haskell function.
As an example, consider
The stub factory mkCallback turns any Haskell computation of type IO () into a C function pointer that can be passed to C routines, which can call back into the Haskell context by invoking the referenced function.
Specification of Header Files A C header specified in an import declaration is always included by #include "chname". There is no explicit support for #include <chname> style inclusion. The ISO C99 [7] standard guarantees that any search path that would be used for a #include <chname> is also used for #include "chname" and it is guaranteed that these paths are searched after all paths that are unique to #include "chname". Furthermore, we require that chname ends in .h to make parsing of the specification of external entities unambiguous.
The specification of include files has been kept to a minimum on purpose. Libraries often require a multitude of include directives, some of which may be system-dependent. Any design that attempts to cover all possible configurations would introduce significant complexity. Moreover, in the current design, a custom include file can be specified that uses the standard C preprocessor features to include all relevant headers.
Header files have no impact on the semantics of a foreign call, and whether an implementation uses the header file or not is implementation-defined. However, as some implementations may require a header file that supplies a correct prototype for external functions in order to generate correct code, portable FFI code must include suitable header files.
C Argument Promotion The argument passing conventions of C are dependent on whether a function prototype for the called functions is in scope at a call site. In particular, if no function prototype is in scope, default argument promotion is applied to integral and floating types. In general, it cannot be expected from a Haskell system that it is aware of whether a given C function was compiled with or without a function prototype being in scope. For the sake of portability, we thus require that a Haskell system generally implements calls to C functions as well as C stubs for Haskell functions as if a function prototype for the called function is in scope.
This convention implies that the onus for ensuring the match between C and Haskell code is placed on the FFI user. In particular, when a C function that was compiled without a prototype is called from Haskell, the Haskell signature at the corresponding foreign import declaration must use the types after argument promotion. For example, consider the following C function definition, which lacks a prototype:
The lack of a prototype implies that a C compiler will apply default argument promotion to the parameter a, and thus, foo will expect to receive a value of type double, not float. Hence, the correct foreign import declaration is
In contrast, a C function compiled with the prototype
requires
A similar situation arises in the case of foreign export declarations that use types that would be altered under the C default argument promotion rules. When calling such Haskell functions from C, a function prototype matching the signature provided in the foreign export declaration must be in scope; otherwise, the C compiler will erroneously apply the promotion rules to all function arguments.
Note that for a C function defined to accept a variable number of arguments, all arguments beyond the explicitly typed arguments suffer argument promotion. However, because C permits the calling convention to be different for such functions, a Haskell system will, in general, not be able to make use of variable argument functions. Hence, their use is deprecated in portable code.
The specification of external entities under the stdcall calling convention is identical to that for standard C calls. The two calling conventions only differ in the generated code.
In addition to the language extension discussed in previous sections, the FFI includes a set of standard libraries, which ease portable use of foreign functions as well as marshalling of compound structures. Generally, the marshalling of Haskell structures into a foreign representation and vice versa can be implemented in either Haskell or the foreign language. At least where the foreign language is at a significantly lower level, e.g. C, there are good reasons for doing the marshalling in Haskell:
Consequently, the Haskell FFI emphasises Haskell-side marshalling.
The interface to the marshalling libraries is provided by the module Foreign (Chapter 24) plus a language-dependent module per supported language. In particular, the standard requires the availability of the module Foreign.C (Chapter 25), which simplifies portable interfacing with external C code. Language-dependent modules, such as Foreign.C, generally provide Haskell types representing the basic types of the foreign language using a representation that is compatible with the foreign types as implemented by the default implementation of the foreign language on the present architecture. This is especially important for languages where the standard leaves some aspects of the implementation of basic types open. For example, in C, the size of the various integral types is not fixed. Thus, to represent C interfaces faithfully in Haskell, for each integral type in C, we need to have an integral type in Haskell that is guaranteed to have the same size as the corresponding C type.
C symbol | Haskell symbol | Constraint on concrete C type |
HsChar | Char | integral type |
HsInt | Int | signed integral type, ≥ 30 bit |
HsInt8 | Int8 | signed integral type, 8 bit; int8_t if available |
HsInt16 | Int16 | signed integral type, 16 bit; int16_t if available |
HsInt32 | Int32 | signed integral type, 32 bit; int32_t if available |
HsInt64 | Int64 | signed integral type, 64 bit; int64_t if available |
HsWord8 | Word8 | unsigned integral type, 8 bit; uint8_t if available |
HsWord16 | Word16 | unsigned integral type, 16 bit; uint16_t if available |
HsWord32 | Word32 | unsigned integral type, 32 bit; uint32_t if available |
HsWord64 | Word64 | unsigned integral type, 64 bit; uint64_t if available |
HsFloat | Float | floating point type |
HsDouble | Double | floating point type |
HsBool | Bool | int |
HsPtr | Ptr a | (void ⋆) |
HsFunPtr | FunPtr a | (void (⋆)(void)) |
HsStablePtr | StablePtr a | (void ⋆) |
CPP symbol | Haskell value | Description |
HS_CHAR_MIN | minBound :: Char |
|
HS_CHAR_MAX | maxBound :: Char |
|
HS_INT_MIN | minBound :: Int |
|
HS_INT_MAX | maxBound :: Int |
|
HS_INT8_MIN | minBound :: Int8 |
|
HS_INT8_MAX | maxBound :: Int8 |
|
HS_INT16_MIN | minBound :: Int16 |
|
HS_INT16_MAX | maxBound :: Int16 |
|
HS_INT32_MIN | minBound :: Int32 |
|
HS_INT32_MAX | maxBound :: Int32 |
|
HS_INT64_MIN | minBound :: Int64 |
|
HS_INT64_MAX | maxBound :: Int64 |
|
HS_WORD8_MAX | maxBound :: Word8 |
|
HS_WORD16_MAX | maxBound :: Word16 |
|
HS_WORD32_MAX | maxBound :: Word32 |
|
HS_WORD64_MAX | maxBound :: Word64 |
|
HS_FLOAT_RADIX | floatRadix :: Float |
|
HS_FLOAT_ROUND | n/a | rounding style as per [7] |
HS_FLOAT_EPSILON | n/a | difference between 1 and the least value greater than 1 as per [7] |
HS_DOUBLE_EPSILON | n/a | (as above) |
HS_FLOAT_DIG | n/a | number of decimal digits as per [7] |
HS_DOUBLE_DIG | n/a | (as above) |
HS_FLOAT_MANT_DIG | floatDigits :: Float |
|
HS_DOUBLE_MANT_DIG | floatDigits :: Double |
|
HS_FLOAT_MIN | n/a | minimum floating point number as per [7] |
HS_DOUBLE_MIN | n/a | (as above) |
HS_FLOAT_MIN_EXP | fst . floatRange :: Float |
|
HS_DOUBLE_MIN_EXP | fst . floatRange :: Double |
|
HS_FLOAT_MIN_10_EXP | n/a | minimum decimal exponent as per [7] |
HS_DOUBLE_MIN_10_EXP | n/a | (as above) |
HS_FLOAT_MAX | n/a | maximum floating point number as per [7] |
HS_DOUBLE_MAX | n/a | (as above) |
HS_FLOAT_MAX_EXP | snd . floatRange :: Float |
|
HS_DOUBLE_MAX_EXP | snd . floatRange :: Double |
|
HS_FLOAT_MAX_10_EXP | n/a | maximum decimal exponent as per [7] |
HS_DOUBLE_MAX_10_EXP | n/a | (as above) |
HS_BOOL_FALSE | False |
|
HS_BOOL_TRUE | True |
|
Every Haskell system that implements the FFI needs to provide a C header file named HsFFI.h that defines the C symbols listed in Tables 8.2 and 8.3. Table 8.2 table lists symbols that represent types together with the Haskell type that they represent and any constraints that are placed on the concrete C types that implement these symbols. When a C type HsT represents a Haskell type T, the occurrence of T in a foreign function declaration should be matched by HsT in the corresponding C function prototype. Indeed, where the Haskell system translates Haskell to C code that invokes foreignimported C routines, such prototypes need to be provided and included via the header that can be specified in external entity strings for foreign C functions (cf. Section 8.5.1); otherwise, the system behaviour is undefined. It is guaranteed that the Haskell value nullPtr is mapped to (HsPtr) NULL in C and nullFunPtr is mapped to (HsFunPtr) NULL and vice versa.
Table 8.3 contains symbols characterising the range and precision of the types from Table 8.2. Where available, the table states the corresponding Haskell values. All C symbols, with the exception of HS_FLOAT_ROUND are constants that are suitable for use in #if preprocessing directives. Note that there is only one rounding style (HS_FLOAT_ROUND) and one radix (HS_FLOAT_RADIX), as this is all that is supported by ISO C [7].
Moreover, an implementation that does not support 64 bit integral types on the C side should implement HsInt64 and HsWord64 as a structure. In this case, the bounds HS_INT64_MIN, HS_INT64_MAX, and HS_WORD64_MAX are undefined.
In addition, to the symbols from Table 8.2 and 8.3, the header HsFFI.h must also contain the following prototypes:
These routines are useful for mixed language programs, where the main application is implemented in a foreign language that accesses routines implemented in Haskell. The function hs_init() initialises the Haskell system and provides it with the available command line arguments. Upon return, the arguments solely intended for the Haskell runtime system are removed (i.e., the values that argc and argv point to may have changed). This function must be called during program startup before any Haskell function is invoked; otherwise, the system behaviour is undefined. Conversely, the Haskell system is deinitialised by a call to hs_exit(). Multiple invocations of hs_init() are permitted, provided that they are followed by an equal number of calls to hs_exit() and that the first call to hs_exit() is after the last call to hs_init(). In addition to nested calls to hs_init(), the Haskell system may be de-initialised with hs_exit() and be re-initialised with hs_init() at a later point in time. This ensures that repeated initialisation due to multiple libraries being implemented in Haskell is covered.
The Haskell system will ignore the command line arguments passed to the second and any following calls to hs_init(). Moreover, hs_init() may be called with NULL for both argc and argv, signalling the absence of command line arguments.
The function hs_set_argv() sets the values returned by the functions getProgName and getArgs of the module System.Environment (Section 39). This function may only be invoked after hs_init(). Moreover, if hs_set_argv() is called at all, this call must precede the first invocation of getProgName and getArgs. Note that the separation of hs_init() and hs_set_argv() is essential in cases where in addition to the Haskell system other libraries that process command line arguments during initialisation are used.
The function hs_perform_gc() advises the Haskell storage manager to perform a garbage collection, where the storage manager makes an effort to releases all unreachable objects. This function must not be invoked from C functions that are imported unsafe into Haskell code nor may it be used from a finalizer.
Finally, hs_free_stable_ptr() and hs_free_fun_ptr() are the C counterparts of the Haskell functions freeStablePtr and freeHaskellFunPtr.
In this chapter the entire Haskell Prelude is given. It constitutes a specification for the Prelude. Many of the definitions are written with clarity rather than efficiency in mind, and it is not required that the specification be implemented as shown here.
The default method definitions, given with class declarations, constitute a specification only of the default method. They do not constitute a specification of the meaning of the method in all instances. To take one particular example, the default method for enumFrom in class Enum will not work properly for types whose range exceeds that of Int (because fromEnum cannot map all values in the type to distinct Int values).
The Prelude shown here is organized into a root module, Prelude, and three sub-modules, PreludeList, PreludeText, and PreludeIO. This structure is purely presentational. An implementation is not required to use this organisation for the Prelude, nor are these three modules available for import separately. Only the exports of module Prelude are significant.
Some of these modules import Library modules, such as Data.Char, Control.Monad, System.IO, and Numeric. These modules are described fully in Part II. These imports are not, of course, part of the specification of the Prelude. That is, an implementation is free to import more, or less, of the Library modules, as it pleases.
Primitives that are not definable in Haskell, indicated by names starting with “prim”, are defined in a system dependent manner in module PreludeBuiltin and are not shown here. Instance declarations that simply bind primitives to class methods are omitted. Some of the more verbose instances with obvious functionality have been left out for the sake of brevity.
Declarations for special types such as Integer, or () are included in the Prelude for completeness even though the declaration may be incomplete or syntactically invalid. An ellipsis “...” is often used in places where the remainder of a definition cannot be given in Haskell.
To reduce the occurrence of unexpected ambiguity errors, and to improve efficiency, a number of commonly-used functions over lists use the Int type rather than using a more general numeric type, such as Integral a or Num a. These functions are: take, drop, !!, length, splitAt, and replicate. The more general versions are given in the Data.List library, with the prefix “generic”; for example genericLength.
These notational conventions are used for presenting syntax:
[pattern] | optional |
{pattern} | zero or more repetitions |
(pattern) | grouping |
pat1 | pat2 | choice |
pat⟨pat′⟩ | difference—elements generated by pat |
except those generated by pat′ |
|
fibonacci | terminal syntax in typewriter font |
BNF-like syntax is used throughout, with productions having the form:
nonterm | → | alt1 | alt2 | … | altn |
In both the lexical and the context-free syntax, there are some ambiguities that are to be resolved by making grammatical phrases as long as possible, proceeding from left to right (in shift-reduce parsing, resolving shift/reduce conflicts by shifting). In the lexical syntax, this is the “maximal munch” rule. In the context-free syntax, this means that conditionals, let-expressions, and lambda abstractions extend to the right as far as possible.
program | → | { lexeme | whitespace } |
lexeme | → | qvarid | qconid | qvarsym | qconsym |
| | literal | special | reservedop | reservedid | |
literal | → | integer | float | char | string |
special | → | ( | ) | , | ; | [ | ] | ` | { | } |
whitespace | → | whitestuff {whitestuff} |
whitestuff | → | whitechar | comment | ncomment |
whitechar | → | newline | vertab | space | tab | uniWhite |
newline | → | return linefeed | return | linefeed | formfeed |
return | → | a carriage return |
linefeed | → | a line feed |
vertab | → | a vertical tab |
formfeed | → | a form feed |
space | → | a space |
tab | → | a horizontal tab |
uniWhite | → | any Unicode character defined as whitespace |
comment | → | dashes [ any⟨symbol⟩ {any} ] newline |
dashes | → | -- {-} |
opencom | → | {- |
closecom | → | -} |
ncomment | → | opencom ANY seq {ncomment ANY seq} closecom |
ANY seq | → | {ANY }⟨{ANY } ( opencom | closecom ) {ANY }⟩ |
ANY | → | graphic | whitechar |
any | → | graphic | space | tab |
graphic | → | small | large | symbol | digit | special | " | ' |
small | → | ascSmall | uniSmall | _ |
ascSmall | → | a | b | … | z |
uniSmall | → | any Unicode lowercase letter |
large | → | ascLarge | uniLarge |
ascLarge | → | A | B | … | Z |
uniLarge | → | any uppercase or titlecase Unicode letter |
symbol | → | ascSymbol | uniSymbol⟨special | _ | " | '⟩ |
ascSymbol | → | ! | # | $ | % | & | ⋆ | + | . | / | < | = | > | ? | @ |
| | \ | ^ | | | - | ~ | : | |
uniSymbol | → | any Unicode symbol or punctuation |
digit | → | ascDigit | uniDigit |
ascDigit | → | 0 | 1 | … | 9 |
uniDigit | → | any Unicode decimal digit |
octit | → | 0 | 1 | … | 7 |
hexit | → | digit | A | … | F | a | … | f |
varid | → | (small {small | large | digit | ' })⟨reservedid⟩ | |
conid | → | large {small | large | digit | ' } | |
reservedid | → | case | class | data | default | deriving | do | else | |
| | foreign | if | import | in | infix | infixl | ||
| | infixr | instance | let | module | newtype | of | ||
| | then | type | where | _ | ||
varsym | → | ( symbol⟨:⟩ {symbol} )⟨reservedop | dashes⟩ | |
consym | → | ( : {symbol})⟨reservedop⟩ | |
reservedop | → | .. | : | :: | = | \ | | | <- | -> | @ | ~ | => | |
varid | (variables) | ||
conid | (constructors) | ||
tyvar | → | varid | (type variables) |
tycon | → | conid | (type constructors) |
tycls | → | conid | (type classes) |
modid | → | {conid .} conid | (modules) |
qvarid | → | [ modid . ] varid | |
qconid | → | [ modid . ] conid | |
qtycon | → | [ modid . ] tycon | |
qtycls | → | [ modid . ] tycls | |
qvarsym | → | [ modid . ] varsym | |
qconsym | → | [ modid . ] consym | |
decimal | → | digit{digit} | |
octal | → | octit{octit} | |
hexadecimal | → | hexit{hexit} | |
integer | → | decimal | |
| | 0o octal | 0O octal | ||
| | 0x hexadecimal | 0X hexadecimal | ||
float | → | decimal . decimal [exponent] | |
| | decimal exponent | ||
exponent | → | (e | E) [+ | -] decimal | |
char | → | ' (graphic⟨' | \⟩ | space | escape⟨\&⟩) ' | |
string | → | " {graphic⟨" | \⟩ | space | escape | gap} " | |
escape | → | \ ( charesc | ascii | decimal | o octal | x hexadecimal ) | |
charesc | → | a | b | f | n | r | t | v | \ | " | ' | & | |
ascii | → | ^cntrl | NUL | SOH | STX | ETX | EOT | ENQ | ACK | |
| | BEL | BS | HT | LF | VT | FF | CR | SO | SI | DLE | ||
| | DC1 | DC2 | DC3 | DC4 | NAK | SYN | ETB | CAN | ||
| | EM | SUB | ESC | FS | GS | RS | US | SP | DEL | ||
cntrl | → | ascLarge | @ | [ | \ | ] | ^ | _ | |
gap | → | \ whitechar {whitechar} \ |
Section 2.7 gives an informal discussion of the layout rule. This section defines it more precisely.
The meaning of a Haskell program may depend on its layout. The effect of layout on its meaning can be completely described by adding braces and semicolons in places determined by the layout. The meaning of this augmented program is now layout insensitive.
The effect of layout is specified in this section by describing how to add braces and semicolons to a laid-out program. The specification takes the form of a function L that performs the translation. The input to L is:
There is no < n > inserted before the \Bill, because it is not the beginning of a complete lexeme; nor before the ,, because it is not preceded only by white space.)
The “indentation” of a lexeme is the column number of the first character of that lexeme; the indentation of a line is the indentation of its leftmost lexeme. To determine the column number, assume a fixed-width font with the following conventions:
For the purposes of the layout rule, Unicode characters in a source program are considered to be of the same, fixed, width as an ASCII character. However, to avoid visual confusion, programmers should avoid writing programs in which the meaning of implicit layout depends on the width of non-space characters.
The application L tokens [] delivers a layout-insensitive translation of tokens, where tokens is the result of lexically analysing a module and adding column-number indicators to it as described above. The definition of L is as follows, where we use “:” as a stream construction operator, and “[]” for the empty stream.
L (< n >: ts) (m : ms) | = | ; : (L ts (m : ms)) | if m = n |
= | } : (L (< n >: ts) ms) | if n < m |
|
L (< n >: ts) ms | = | L ts ms |
|
L ({n} : ts) (m : ms) | = | { : (L ts (n : m : ms)) | if n > m (Note 1) |
L ({n} : ts) [] | = | { : (L ts [n]) | if n > 0 (Note 1) |
L ({n} : ts) ms | = | { : } : (L (< n >: ts) ms) | (Note 2) |
L (} : ts) (0 : ms) | = | } : (L ts ms) | (Note 3) |
L (} : ts) ms | = | parse-error | (Note 3) |
L ({ : ts) ms | = | { : (L ts (0 : ms)) | (Note 4) |
L (t : ts) (m : ms) | = | } : (L (t : ts) ms) | if m∕ = 0 and parse-error(t) |
(Note 5) |
|||
L (t : ts) ms | = | t : (L ts ms) |
|
L [] [] | = | [] |
|
L [] (m : ms) | = | } : L [] ms | if m≠0 (Note 6) |
Here, the definition of p is indented less than the indentation of the enclosing context, which is set in this case by the definition of h.
The test m∕ = 0 checks that an implicitly-added closing brace would match an implicit open brace.
If none of the rules given above matches, then the algorithm fails. It can fail for instance when the end of the input is reached, and a non-layout context is active, since the close brace is missing. Some error conditions are not detected by the algorithm, although they could be: for example let }.
Note 1 implements the feature that layout processing can be stopped prematurely by a parse error. For example
is valid, because it translates to
The close brace is inserted due to the parse error rule above.
The “literate comment” convention, first developed by Richard Bird and Philip Wadler for Orwell, and inspired in turn by Donald Knuth’s “literate programming”, is an alternative style for encoding Haskell source code. The literate style encourages comments by making them the default. A line in which “>” is the first character is treated as part of the program; all other lines are comments.
The program text is recovered by taking only those lines beginning with “>”, and replacing the leading “>” with a space. Layout and comments apply exactly as described in Chapter 10 in the resulting text.
To capture some cases where one omits an “>” by mistake, it is an error for a program line to appear adjacent to a non-blank comment line, where a line is taken as blank if it consists only of whitespace.
By convention, the style of comment is indicated by the file extension, with “.hs” indicating a usual Haskell file and “.lhs” indicating a literate Haskell file. Using this style, a simple factorial program would be:
An alternative style of literate programming is particularly suitable for use with the LaTeX text processing system. In this convention, only those parts of the literate program that are entirely enclosed between \begin{code}…\end{code} delimiters are treated as program text; all other lines are comments. More precisely:
It is not necessary to insert additional blank lines before or after these delimiters, though it may be stylistically desirable. For example,
This style uses the same file extension. It is not advisable to mix these two styles in the same file.
module | → | module modid [exports] where body | |
| | body | ||
body | → | { impdecls ; topdecls } | |
| | { impdecls } | ||
| | { topdecls } | ||
impdecls | → | impdecl1 ; … ; impdecln | (n ≥ 1) |
exports | → | ( export1 , … , exportn [ , ] ) | (n ≥ 0) |
export | → | qvar | |
| | qtycon [(..) | ( cname1 , … , cnamen )] | (n ≥ 0) | |
| | qtycls [(..) | ( qvar1 , … , qvarn )] | (n ≥ 0) | |
| | module modid | ||
impdecl | → | import [qualified] modid [as modid] [impspec] | |
| | (empty declaration) | ||
impspec | → | ( import1 , … , importn [ , ] ) | (n ≥ 0) |
| | hiding ( import1 , … , importn [ , ] ) | (n ≥ 0) | |
import | → | var | |
| | tycon [ (..) | ( cname1 , … , cnamen )] | (n ≥ 0) | |
| | tycls [(..) | ( var1 , … , varn )] | (n ≥ 0) | |
cname | → | var | con | |
topdecls | → | topdecl1 ; … ; topdecln | (n ≥ 0) |
topdecl | → | type simpletype = type | |
| | data [context =>] simpletype [= constrs] [deriving] | ||
| | newtype [context =>] simpletype = newconstr [deriving] | ||
| | class [scontext =>] tycls tyvar [where cdecls] | ||
| | instance [scontext =>] qtycls inst [where idecls] | ||
| | default (type1 , … , typen) | (n ≥ 0) | |
| | foreign fdecl | ||
| | decl | ||
decls | → | { decl1 ; … ; decln } | (n ≥ 0) |
decl | → | gendecl | |
| | (funlhs | pat) rhs | ||
cdecls | → | { cdecl1 ; … ; cdecln } | (n ≥ 0) |
cdecl | → | gendecl | |
| | (funlhs | var) rhs | ||
idecls | → | { idecl1 ; … ; idecln } | (n ≥ 0) |
idecl | → | (funlhs | var) rhs | |
| | (empty) | ||
gendecl | → | vars :: [context =>] type | (type signature) |
| | fixity [integer] ops | (fixity declaration) | |
| | (empty declaration) | ||
ops | → | op1 , … , opn | (n ≥ 1) |
vars | → | var1 , …, varn | (n ≥ 1) |
fixity | → | infixl | infixr | infix | |
type | → | btype [-> type] | (function type) |
btype | → | [btype] atype | (type application) |
atype | → | gtycon | |
| | tyvar | ||
| | ( type1 , … , typek ) | (tuple type, k ≥ 2) | |
| | [ type ] | (list type) | |
| | ( type ) | (parenthesized constructor) | |
gtycon | → | qtycon | |
| | () | (unit type) | |
| | [] | (list constructor) | |
| | (->) | (function constructor) | |
| | (,{,}) | (tupling constructors) | |
context | → | class | |
| | ( class1 , … , classn ) | (n ≥ 0) | |
class | → | qtycls tyvar | |
| | qtycls ( tyvar atype1 … atypen ) | (n ≥ 1) | |
scontext | → | simpleclass | |
| | ( simpleclass1 , … , simpleclassn ) | (n ≥ 0) | |
simpleclass | → | qtycls tyvar | |
simpletype | → | tycon tyvar1 … tyvark | (k ≥ 0) |
constrs | → | constr1 | … | constrn | (n ≥ 1) |
constr | → | con [!] atype1 … [!] atypek | (arity con = k, k ≥ 0) |
| | (btype | ! atype) conop (btype | ! atype) | (infix conop) | |
| | con { fielddecl1 , … , fielddecln } | (n ≥ 0) | |
newconstr | → | con atype | |
| | con { var :: type } | ||
fielddecl | → | vars :: (type | ! atype) | |
deriving | → | deriving (dclass | (dclass1, … , dclassn)) | (n ≥ 0) |
dclass | → | qtycls | |
inst | → | gtycon | |
| | ( gtycon tyvar1 … tyvark ) | (k ≥ 0, tyvars distinct) | |
| | ( tyvar1 , … , tyvark ) | (k ≥ 2, tyvars distinct) | |
| | [ tyvar ] | ||
| | ( tyvar1 -> tyvar2 ) | tyvar1 and tyvar2 distinct | |
fdecl | → | import callconv [safety] impent var :: ftype | (define variable) |
| | export callconv expent var :: ftype | (expose variable) | |
callconv | → | ccall | stdcall | cplusplus | (calling convention) |
| | jvm | dotnet | ||
| | system-specific calling conventions | ||
impent | → | [string] | (see Section 8.5.1) |
expent | → | [string] | (see Section 8.5.1) |
safety | → | unsafe | safe | |
ftype | → | frtype | |
| | fatype → ftype | ||
frtype | → | fatype | |
| | () | ||
fatype | → | qtycon atype1 … atypek | (k ≥ 0) |
funlhs | → | var apat { apat } | |
| | pat varop pat | ||
| | ( funlhs ) apat { apat } | ||
rhs | → | = exp [where decls] | |
| | gdrhs [where decls] | ||
gdrhs | → | guards = exp [gdrhs] | |
guards | → | | guard1, …, guardn | (n ≥ 1) |
guard | → | pat <- infixexp | (pattern guard) |
| | let decls | (local declaration) | |
| | infixexp | (boolean guard) | |
exp | → | infixexp :: [context =>] type | (expression type signature) |
| | infixexp | ||
infixexp | → | lexp qop infixexp | (infix operator application) |
| | - infixexp | (prefix negation) | |
| | lexp | ||
lexp | → | \ apat1 … apatn -> exp | (lambda abstraction, n ≥ 1) |
| | let decls in exp | (let expression) | |
| | if exp [;] then exp [;] else exp | (conditional) | |
| | case exp of { alts } | (case expression) | |
| | do { stmts } | (do expression) | |
| | fexp | ||
fexp | → | [fexp] aexp | (function application) |
aexp | → | qvar | (variable) |
| | gcon | (general constructor) | |
| | literal | ||
| | ( exp ) | (parenthesized expression) | |
| | ( exp1 , … , expk ) | (tuple, k ≥ 2) | |
| | [ exp1 , … , expk ] | (list, k ≥ 1) | |
| | [ exp1 [, exp2] .. [exp3] ] | (arithmetic sequence) | |
| | [ exp | qual1 , … , qualn ] | (list comprehension, n ≥ 1) | |
| | ( infixexp qop ) | (left section) | |
| | ( qop⟨-⟩ infixexp ) | (right section) | |
| | qcon { fbind1 , … , fbindn } | (labeled construction, n ≥ 0) | |
| | aexp⟨qcon⟩ { fbind1 , … , fbindn } | (labeled update, n ≥ 1) | |
qual | → | pat <- exp | (generator) |
| | let decls | (local declaration) | |
| | exp | (guard) | |
alts | → | alt1 ; … ; altn | (n ≥ 1) |
alt | → | pat -> exp [where decls] | |
| | pat gdpat [where decls] | ||
| | (empty alternative) | ||
gdpat | → | guards -> exp [ gdpat ] | |
stmts | → | stmt1 … stmtn exp [;] | (n ≥ 0) |
stmt | → | exp ; | |
| | pat <- exp ; | ||
| | let decls ; | ||
| | ; | (empty statement) | |
fbind | → | qvar = exp | |
pat | → | lpat qconop pat | (infix constructor) |
| | lpat | ||
lpat | → | apat | |
| | - (integer | float) | (negative literal) | |
| | gcon apat1 … apatk | (arity gcon = k, k ≥ 1) | |
apat | → | var [ @ apat] | (as pattern) |
| | gcon | (arity gcon = 0) | |
| | qcon { fpat1 , … , fpatk } | (labeled pattern, k ≥ 0) | |
| | literal | ||
| | _ | (wildcard) | |
| | ( pat ) | (parenthesized pattern) | |
| | ( pat1 , … , patk ) | (tuple pattern, k ≥ 2) | |
| | [ pat1 , … , patk ] | (list pattern, k ≥ 1) | |
| | ~ apat | (irrefutable pattern) | |
fpat | → | qvar = pat | |
gcon | → | () | |
| | [] | ||
| | (,{,}) | ||
| | qcon | ||
var | → | varid | ( varsym ) | (variable) |
qvar | → | qvarid | ( qvarsym ) | (qualified variable) |
con | → | conid | ( consym ) | (constructor) |
qcon | → | qconid | ( gconsym ) | (qualified constructor) |
varop | → | varsym | ` varid ` | (variable operator) |
qvarop | → | qvarsym | ` qvarid ` | (qualified variable operator) |
conop | → | consym | ` conid ` | (constructor operator) |
qconop | → | gconsym | ` qconid ` | (qualified constructor operator) |
op | → | varop | conop | (operator) |
qop | → | qvarop | qconop | (qualified operator) |
gconsym | → | : | qconsym |
The following is an example implementation of fixity resolution for Haskell expressions. Fixity resolution also applies to Haskell patterns, but patterns are a subset of expressions so in what follows we consider only expressions for simplicity.
The function resolve takes a list in which the elements are expressions or operators, i.e. an instance of the infixexp non-terminal in the context-free grammar. It returns either Just e where e is the resolved expression, or Nothing if the input does not represent a valid expression. In a compiler, of course, it would be better to return more information about the operators involved for the purposes of producing a useful error message, but the Maybe type will suffice to illustrate the algorithm here.
The algorithm works as follows. At each stage we have a call
which means that we are looking at an expression like
(the caller holds E0). The job of parse is to build the expression to the right of op1, returning the expression and any remaining input.
There are three cases to consider:
To initialise the algorithm, we set op1 to be an imaginary operator with precedence lower than anything else. Hence parse will consume the whole input, and return the resulting expression.
The handling of the prefix negation operator, -, complicates matters only slightly. Recall that prefix negation has the same fixity as infix negation: left-associative with precedence 6. The operator to the left of -, if there is one, must have precedence lower than 6 for the expression to be legal. The negation operator itself may left-associate with operators of the same fixity (e.g. +). So for example -a + b is legal and resolves as (-a) + b, but a + -b is illegal.
The function parseNeg handles prefix negation. If we encounter a negation operator, and it is legal in this position (the operator to the left has precedence lower than 6), then we proceed in a similar way to case (3) above: compute the argument to - by recursively calling parseNeg, and then continue by calling parse.
Note that this algorithm is insensitive to the range and resolution of precedences. There is no reason in principle that Haskell should be limited to integral precedences in the range 1 to 10; a larger range, or fractional values, would present no additional difficulties.
A derived instance is an instance declaration that is generated automatically in conjunction with a data or newtype declaration. The body of a derived instance declaration is derived syntactically from the definition of the associated type. Derived instances are possible only for classes known to the compiler: those defined in either the Prelude or a standard library. In this chapter, we describe the derivation of classes defined by the Prelude.
If T is an algebraic datatype declared by:
data cx => T u1 … uk | = | K1 t11 … t1k1 | ![]() |
deriving (C1, …, Cm) |
For the purposes of derived instances, a newtype declaration is treated as a data declaration with a single constructor.
If the deriving form is present, an instance declaration is automatically generated for T u1 … uk over each class Ci. If the derived instance declaration is impossible for any of the Ci then a static error results. If no derived instances are required, the deriving form may be omitted or the form deriving () may be used.
Each derived instance declaration will have the form: instance (cx, cx′) => Ci (T u1 … uk) where { d } where d is derived automatically depending on Ci and the data type declaration for T (as will be described in the remainder of this section).
The context cx′ is the smallest context satisfying point (2) above. For mutually recusive data types, the compiler may need to perform a fixpoint calculation to compute it.
The remaining details of the derived instances for each of the derivable Prelude classes are now given. Free variables and constructors used in these translations always refer to entities defined by the Prelude.
The class methods automatically introduced by derived instances of Eq and Ord are (==), (/=), compare, (<), (<=), (>), (>=), max, and min. The latter seven operators are defined so as to compare their arguments lexicographically with respect to the constructor set given, with earlier constructors in the datatype declaration counting as smaller than later ones. For example, for the Bool datatype, we have that (True > False) == True.
Derived comparisons always traverse constructors from left to right. These examples illustrate this property:
(1,undefined) == (2,undefined) ⇒ False
(undefined,1) == (undefined,2) ⇒ ⊥
All derived operations of class Eq and Ord are strict in both arguments. For example, False <= ⊥ is ⊥, even though False is the first constructor of the Bool type.
Derived instance declarations for the class Enum are only possible for enumerations (data types with only nullary constructors).
The nullary constructors are assumed to be numbered left-to-right with the indices 0 through n − 1. The succ and pred operators give the successor and predecessor respectively of a value, under this numbering scheme. It is an error to apply succ to the maximum element, or pred to the minimum element.
The toEnum and fromEnum operators map enumerated values to and from the Int type; toEnum raises a runtime error if the Int argument is not the index of one of the constructors.
The definitions of the remaining methods are
where firstCon and lastCon are respectively the first and last constructors listed in the data declaration. For example, given the datatype:
we would have:
The Bounded class introduces the class methods minBound and maxBound, which define the minimal and maximal elements of the type. For an enumeration, the first and last constructors listed in the data declaration are the bounds. For a type with a single constructor, the constructor is applied to the bounds for the constituent types. For example, the following datatype:
would generate the following Bounded instance:
The class methods automatically introduced by derived instances of Read and Show are showsPrec, readsPrec, showList, and readList. They are used to coerce values into strings and parse strings into values.
The function showsPrec d x r accepts a precedence level d (a number from 0 to 11), a value x, and a string r. It returns a string representing x concatenated to r. showsPrec satisfies the law: showsPrec d x r ++ s == showsPrec d x (r ++ s) The representation will be enclosed in parentheses if the precedence of the top-level constructor in x is less than d. Thus, if d is 0 then the result is never surrounded in parentheses; if d is 11 it is always surrounded in parentheses, unless it is an atomic expression (recall that function application has precedence 10). The extra parameter r is essential if tree-like structures are to be printed in linear time rather than time quadratic in the size of the tree.
The function readsPrec d s accepts a precedence level d (a number from 0 to 10) and a string s, and attempts to parse a value from the front of the string, returning a list of (parsed value, remaining string) pairs. If there is no successful parse, the returned list is empty. Parsing of an un-parenthesised infix operator application succeeds only if the precedence of the operator is greater than or equal to d.
It should be the case that
(x,"") is an element of (readsPrec d (showsPrec d x ""))
That is, readsPrec should be able to parse the string produced by showsPrec, and should deliver the value that showsPrec started with.
showList and readList allow lists of objects to be represented using non-standard denotations. This is especially useful for strings (lists of Char).
readsPrec will parse any valid representation of the standard types apart from strings, for which only quoted strings are accepted, and other lists, for which only the bracketed form […] is accepted. See Chapter 9 for full details.
The result of show is a syntactically correct Haskell expression containing only constants, given the fixity declarations in force at the point where the type is declared. It contains only the constructor names defined in the data type, parentheses, and spaces. When labelled constructor fields are used, braces, commas, field names, and equal signs are also used. Parentheses are only added where needed, ignoring associativity. No line breaks are added. The result of show is readable by read if all component types are readable. (This is true for all instances defined in the Prelude but may not be true for user-defined instances.)
Derived instances of Read make the following assumptions, which derived instances of Show obey:
then:
The derived Read and Show instances may be unsuitable for some uses. Some problems include:
As a complete example, consider a tree datatype:
Automatic derivation of instance declarations for Bounded and Enum are not possible, as Tree is not an enumeration or single-constructor datatype. The complete instance declarations for Tree are shown in Figure 11.1, Note the implicit use of default class method definitions—for example, only <= is defined for Ord, with the other class methods (<, >, >=, max, and min) being defined by the defaults given in the class declaration shown in Figure 6.1.
Some compiler implementations support compiler pragmas, which are used to give additional instructions or hints to the compiler, but which do not form part of the Haskell language proper and do not change a program’s semantics. This chapter summarizes this existing practice. An implementation is not required to respect any pragma, although pragmas that are not recognised by the implementation should be ignored. Implementations are strongly encouraged to support the LANGUAGE pragma described below as there are many language extensions being used in practice.
Lexically, pragmas appear as comments, except that the enclosing syntax is {-##-}.
decl | → | {-# INLINE qvars #-} |
decl | → | {-# NOINLINE qvars #-} |
The INLINE pragma instructs the compiler to inline the specified variables at their use sites. Compilers will often automatically inline simple expressions. This may be prevented by the NOINLINE pragma.
decl | → | {-# SPECIALIZE spec1 , … , speck #-} | (k ≥ 1) |
spec | → | vars :: type |
Specialization is used to avoid inefficiencies involved in dispatching overloaded functions. For example, in
calls to factorial in which the compiler can detect that the parameter is either Int or Integer will use specialized versions of factorial which do not involve overloaded numeric operations.
The LANGUAGE pragma is a file-header pragma. A file-header pragma must precede the module keyword in a source file. There can be as many file-header pragmas as you please, and they can be preceded or followed by comments. An individual language pragma begins with the keyword LANGUAGE and is followed by a comma-separated list of named language features.
For example, to enable scoped type variables and preprocessing with CPP, if your Haskell implementation supports these extensions:
If a Haskell implementation does not recognize or support a particular language feature that a source file requests (or cannot support the combination of language features requested), any attempt to compile or otherwise use that file with that Haskell implementation must fail with an error.
In the interests of portability, multiple attempts to enable the same, supported language features (e.g. via command-line arguments, implementation-specific features dependencies or non-standard pragmas) are specifically permitted. Haskell 2010 implementations that support the LANGUAGE pragma are required to support
Those implementations are also encouraged to support the following named language features:
These are the named language extensions supported by some pre-Haskell 2010 implementations, that have been integrated into this report.
The Control.Monad module provides the Functor, Monad and MonadPlus classes, together with some useful operations on monads.
class Functor f where |
The instances of Functor for lists, Data.Maybe.Maybe and System.IO.IO satisfy these laws.
Methods
fmap :: (a -> b) -> f a -> f b |
instance Functor [] |
instance Functor IO |
instance Functor Maybe |
instance Ix i => Functor (Array i) |
class Monad m where |
Minimal complete definition: >>= and return.
Instances of Monad should satisfy the following laws:
Instances of both Monad and Functor should additionally satisfy the law:
The instances of Monad for lists, Data.Maybe.Maybe and System.IO.IO defined in the Prelude satisfy these laws.
Methods
(>>=) :: m a -> (a -> m b) -> m b |
(>>) :: m a -> m b -> m b |
return :: a -> m a |
fail :: String -> m a |
instance Monad [] |
instance Monad IO |
instance Monad Maybe |
class Monad m => MonadPlus m where |
Methods
mzero :: m a |
mplus :: m a -> m a -> m a |
instance MonadPlus [] |
instance MonadPlus Maybe |
The functions in this library use the following naming conventions:
mapM :: Monad m => (a -> m b) -> [a] -> m [b] |
mapM_ :: Monad m => (a -> m b) -> [a] -> m () |
forM :: Monad m => [a] -> (a -> m b) -> m [b] |
forM_ :: Monad m => [a] -> (a -> m b) -> m () |
sequence :: Monad m => [m a] -> m [a] |
sequence_ :: Monad m => [m a] -> m () |
(=<<) :: Monad m => (a -> m b) -> m a -> m b |
(>=>) :: Monad m => (a -> m b) -> (b -> m c) -> a -> m c |
(<=<) :: Monad m => (b -> m c) -> (a -> m b) -> a -> m c |
forever :: Monad m => m a -> m b |
void :: Functor f => f a -> f () |
join :: Monad m => m (m a) -> m a |
msum :: MonadPlus m => [m a] -> m a |
filterM :: Monad m => (a -> m Bool) -> [a] -> m [a] |
mapAndUnzipM :: Monad m => (a -> m (b, c)) -> [a] -> m ([b], [c]) |
zipWithM :: Monad m => (a -> b -> m c) -> [a] -> [b] -> m [c] |
zipWithM_ :: Monad m => (a -> b -> m c) -> [a] -> [b] -> m () |
foldM :: Monad m => (a -> b -> m a) -> a -> [b] -> m a |
==
If right-to-left evaluation is required, the input list should be reversed.
foldM_ :: Monad m => (a -> b -> m a) -> a -> [b] -> m () |
replicateM :: Monad m => Int -> m a -> m [a] |
replicateM_ :: Monad m => Int -> m a -> m () |
guard :: MonadPlus m => Bool -> m () |
when :: Monad m => Bool -> m () -> m () |
will output the string Debugging\n if the Boolean value debug is True, and otherwise do nothing.
unless :: Monad m => Bool -> m () -> m () |
liftM :: Monad m => (a1 -> r) -> m a1 -> m r |
liftM2 :: Monad m => (a1 -> a2 -> r) -> m a1 -> m a2 -> m r |
liftM3 :: Monad m => (a1 -> a2 -> a3 -> r) |
-> m a1 -> m a2 -> m a3 -> m r |
liftM4 :: Monad m => (a1 -> a2 -> a3 -> a4 -> r) |
-> m a1 -> m a2 -> m a3 -> m a4 -> m r |
liftM5 :: Monad m => (a1 -> a2 -> a3 -> a4 -> a5 -> r) |
-> m a1 -> m a2 -> m a3 -> m a4 -> m a5 -> m r |
ap :: Monad m => m (a -> b) -> m a -> m b |
is equivalent to
Haskell provides indexable arrays, which may be thought of as functions whose domains are isomorphic to contiguous subsets of the integers. Functions restricted in this way can be implemented efficiently; in particular, a programmer may reasonably expect rapid access to the components. To ensure the possibility of such an implementation, arrays are treated as data, not as general functions.
Since most array functions involve the class Ix, the contents of the module Data.Ix are re-exported from Data.Array for convenience:
module Data.Ix |
data Ix i => Array i e |
instance Ix i => Functor (Array i) |
instance (Ix i, Eq e) => Eq (Array i e) |
instance (Ix i, Ord e) => Ord (Array i e) |
instance (Ix a, Read a, Read b) => Read (Array a b) |
instance (Ix a, Show a, Show b) => Show (Array a b) |
array |
:: | Ix i | |
=> | (i, i) | a pair of bounds, each of the index type of the array. These bounds are the lowest and highest indices in the array, in that order. For example, a one-origin vector of length ’10’ has bounds ’(1,10)’, and a one-origin ’10’ by ’10’ matrix has bounds ’((1,1),(10,10))’. |
-> | [(i, e)] | a list of associations of the form (index, value). Typically, this list will be expressed as a comprehension. An association ’(i, x)’ defines the value of the array at index i to be x. |
-> | Array i e | |
Construct an array with the specified bounds and containing values for given indices within these bounds.
The array is undefined (i.e. bottom) if any index in the list is out of bounds. If any two associations in the list have the same index, the value at that index is undefined (i.e. bottom).
Because the indices must be checked for these errors, array is strict in the bounds argument and in the indices of the association list, but non-strict in the values. Thus, recurrences such as the following are possible:
Not every index within the bounds of the array need appear in the association list, but the values associated with indices that do not appear will be undefined (i.e. bottom).
If, in any dimension, the lower bound is greater than the upper bound, then the array is legal, but empty. Indexing an empty array always gives an array-bounds error, but bounds still yields the bounds with which the array was constructed.
listArray :: Ix i => (i, i) -> [e] -> Array i e |
accumArray |
:: | Ix i | |
=> | (e -> a -> e) | accumulating function |
-> | e | initial value |
-> | (i, i) | bounds of the array |
-> | [(i, a)] | association list |
-> | Array i e | |
The accumArray function deals with repeated indices in the association list using an accumulating function which combines the values of associations with the same index. For example, given a list of values of some index type, hist produces a histogram of the number of occurrences of each index within a specified range:
If the accumulating function is strict, then accumArray is strict in the values, as well as the indices, in the association list. Thus, unlike ordinary arrays built with array, accumulated arrays should not in general be recursive.
(!) :: Ix i => Array i e -> i -> e |
bounds :: Ix i => Array i e -> (i, i) |
indices :: Ix i => Array i e -> [i] |
elems :: Ix i => Array i e -> [e] |
assocs :: Ix i => Array i e -> [(i, e)] |
(//) :: Ix i => Array i e -> [(i, e)] -> Array i e |
is the same matrix, except with the diagonal zeroed.
Repeated indices in the association list are handled as for array: the resulting array is undefined (i.e. bottom),
accum :: Ix i => (e -> a -> e) |
-> Array i e -> [(i, a)] -> Array i e |
ixmap :: (Ix i, Ix j) => (i, i) |
-> (i -> j) -> Array j e -> Array i e |
A similar transformation of array values may be achieved using fmap from the Array instance of the Functor class.
This module defines bitwise operations for signed and unsigned integers.
class Num a => Bits a where |
Minimal complete definition: .&., .|., xor, complement, (shift or (shiftL and shiftR)), (rotate or (rotateL and rotateR)), bitSize and isSigned.
Methods
(.&.) :: a -> a -> a |
(.|.) :: a -> a -> a |
xor :: a -> a -> a |
complement :: a -> a |
shift :: a -> Int -> a |
An instance can define either this unified shift or shiftL and shiftR, depending on which is more convenient for the type in question.
rotate :: a -> Int -> a |
For unbounded types like Integer, rotate is equivalent to shift.
An instance can define either this unified rotate or rotateL and rotateR, depending on which is more convenient for the type in question.
bit :: Int -> a |
setBit :: a -> Int -> a |
clearBit :: a -> Int -> a |
complementBit :: a -> Int -> a |
testBit :: a -> Int -> Bool |
bitSize :: a -> Int |
isSigned :: a -> Bool |
shiftL :: a -> Int -> a |
An instance can define either this and shiftR or the unified shift, depending on which is more convenient for the type in question.
shiftR :: a -> Int -> a |
An instance can define either this and shiftL or the unified shift, depending on which is more convenient for the type in question.
instance Bits Int |
instance Bits Int8 |
instance Bits Int16 |
instance Bits Int32 |
instance Bits Int64 |
instance Bits Integer |
instance Bits Word |
instance Bits Word8 |
instance Bits Word16 |
instance Bits Word32 |
instance Bits Word64 |
instance Bits WordPtr |
instance Bits IntPtr |
instance Bits CChar |
instance Bits CSChar |
instance Bits CUChar |
instance Bits CShort |
instance Bits CUShort |
instance Bits CInt |
instance Bits CUInt |
instance Bits CLong |
instance Bits CULong |
instance Bits CLLong |
instance Bits CULLong |
instance Bits CPtrdiff |
instance Bits CSize |
instance Bits CWchar |
instance Bits CSigAtomic |
instance Bits CIntPtr |
instance Bits CUIntPtr |
instance Bits CIntMax |
instance Bits CUIntMax |
data Char |
To convert a Char to or from the corresponding Int value defined by Unicode, use Prelude.toEnum and Prelude.fromEnum from the Prelude.Enum class respectively (or equivalently ord and chr).
instance Bounded Char |
instance Enum Char |
instance Eq Char |
instance Ord Char |
instance Read Char |
instance Show Char |
instance Ix Char |
instance Storable Char |
type String = [Char] |
Unicode characters are divided into letters, numbers, marks, punctuation, symbols, separators (including spaces) and others (including control characters).
isControl :: Char -> Bool |
isSpace :: Char -> Bool |
isLower :: Char -> Bool |
isUpper :: Char -> Bool |
isAlpha :: Char -> Bool |
isAlphaNum :: Char -> Bool |
Note that numeric digits outside the ASCII range are selected by this function but not by isDigit. Such digits may be part of identifiers but are not used by the printer and reader to represent numbers.
isPrint :: Char -> Bool |
isDigit :: Char -> Bool |
isOctDigit :: Char -> Bool |
isHexDigit :: Char -> Bool |
isLetter :: Char -> Bool |
isMark :: Char -> Bool |
isNumber :: Char -> Bool |
isPunctuation :: Char -> Bool |
isSymbol :: Char -> Bool |
isSeparator :: Char -> Bool |
isAscii :: Char -> Bool |
isLatin1 :: Char -> Bool |
isAsciiUpper :: Char -> Bool |
isAsciiLower :: Char -> Bool |
data GeneralCategory |
= | UppercaseLetter | Lu: Letter, Uppercase |
| | LowercaseLetter | Ll: Letter, Lowercase |
| | TitlecaseLetter | Lt: Letter, Titlecase |
| | ModifierLetter | Lm: Letter, Modifier |
| | OtherLetter | Lo: Letter, Other |
| | NonSpacingMark | Mn: Mark, Non-Spacing |
| | SpacingCombiningMark | Mc: Mark, Spacing Combining |
| | EnclosingMark | Me: Mark, Enclosing |
| | DecimalNumber | Nd: Number, Decimal |
| | LetterNumber | Nl: Number, Letter |
| | OtherNumber | No: Number, Other |
| | ConnectorPunctuation | Pc: Punctuation, Connector |
| | DashPunctuation | Pd: Punctuation, Dash |
| | OpenPunctuation | Ps: Punctuation, Open |
| | ClosePunctuation | Pe: Punctuation, Close |
| | InitialQuote | Pi: Punctuation, Initial quote |
| | FinalQuote | Pf: Punctuation, Final quote |
| | OtherPunctuation | Po: Punctuation, Other |
| | MathSymbol | Sm: Symbol, Math |
| | CurrencySymbol | Sc: Symbol, Currency |
| | ModifierSymbol | Sk: Symbol, Modifier |
| | OtherSymbol | So: Symbol, Other |
| | Space | Zs: Separator, Space |
| | LineSeparator | Zl: Separator, Line |
| | ParagraphSeparator | Zp: Separator, Paragraph |
| | Control | Cc: Other, Control |
| | Format | Cf: Other, Format |
| | Surrogate | Cs: Other, Surrogate |
| | PrivateUse | Co: Other, Private Use |
| | NotAssigned | Cn: Other, Not Assigned |
Unicode General Categories (column 2 of the UnicodeData table) in the order they are listed in the Unicode standard.
instance Bounded GeneralCategory |
instance Enum GeneralCategory |
instance Eq GeneralCategory |
instance Ord GeneralCategory |
instance Read GeneralCategory |
instance Show GeneralCategory |
instance Ix GeneralCategory |
generalCategory :: Char -> GeneralCategory |
toUpper :: Char -> Char |
toLower :: Char -> Char |
toTitle :: Char -> Char |
digitToInt :: Char -> Int |
intToDigit :: Int -> Char |
ord :: Char -> Int |
chr :: Int -> Char |
showLitChar :: Char -> ShowS |
lexLitChar :: ReadS String |
readLitChar :: ReadS Char |
data RealFloat a => Complex a |
= | !a :+ !a | forms a complex number from its real and imaginary rectangular components. |
Complex numbers are an algebraic type.
For a complex number z, abs z is a number with the magnitude of z, but oriented in the positive real direction, whereas signum z has the phase of z, but unit magnitude.
instance RealFloat a => Eq (Complex a) |
instance RealFloat a => Floating (Complex a) |
instance RealFloat a => Fractional (Complex a) |
instance RealFloat a => Num (Complex a) |
instance (Read a, RealFloat a) => Read (Complex a) |
instance RealFloat a => Show (Complex a) |
realPart :: RealFloat a => Complex a -> a |
imagPart :: RealFloat a => Complex a -> a |
mkPolar :: RealFloat a => a -> a -> Complex a |
cis :: RealFloat a => a -> Complex a |
polar :: RealFloat a => Complex a -> (a, a) |
magnitude :: RealFloat a => Complex a -> a |
phase :: RealFloat a => Complex a -> a |
conjugate :: RealFloat a => Complex a -> Complex a |
This module provides signed integer types of unspecified width (Int) and fixed widths (Int8, Int16, Int32 and Int64). All arithmetic is performed modulo 2^n, where n is the number of bits in the type.
For coercing between any two integer types, use fromIntegral. Coercing word types (see Data.Word) to and from integer types preserves representation, not sign.
The rules that hold for Enum instances over a bounded type such as Int (see the section of the Haskell language report dealing with arithmetic sequences) also hold for the Enum instances over the various Int types defined here.
Right and left shifts by amounts greater than or equal to the width of the type result in either zero or -1, depending on the sign of the value being shifted. This is contrary to the behaviour in C, which is undefined; a common interpretation is to truncate the shift count to the width of the type, for example 1 << 32 == 1 in some C implementations.
data Int |
instance Bounded Int |
instance Enum Int |
instance Eq Int |
instance Integral Int |
instance Num Int |
instance Ord Int |
instance Read Int |
instance Real Int |
instance Show Int |
instance Ix Int |
instance Storable Int |
instance Bits Int |
data Int8 |
instance Bounded Int8 |
instance Enum Int8 |
instance Eq Int8 |
instance Integral Int8 |
instance Num Int8 |
instance Ord Int8 |
instance Read Int8 |
instance Real Int8 |
instance Show Int8 |
instance Ix Int8 |
instance Storable Int8 |
instance Bits Int8 |
data Int16 |
instance Bounded Int16 |
instance Enum Int16 |
instance Eq Int16 |
instance Integral Int16 |
instance Num Int16 |
instance Ord Int16 |
instance Read Int16 |
instance Real Int16 |
instance Show Int16 |
instance Ix Int16 |
instance Storable Int16 |
instance Bits Int16 |
data Int32 |
instance Bounded Int32 |
instance Enum Int32 |
instance Eq Int32 |
instance Integral Int32 |
instance Num Int32 |
instance Ord Int32 |
instance Read Int32 |
instance Real Int32 |
instance Show Int32 |
instance Ix Int32 |
instance Storable Int32 |
instance Bits Int32 |
data Int64 |
instance Bounded Int64 |
instance Enum Int64 |
instance Eq Int64 |
instance Integral Int64 |
instance Num Int64 |
instance Ord Int64 |
instance Read Int64 |
instance Real Int64 |
instance Show Int64 |
instance Ix Int64 |
instance Storable Int64 |
instance Bits Int64 |
class Ord a => Ix a where |
The first argument (l,u) of each of these operations is a pair specifying the lower and upper bounds of a contiguous subrange of values.
An implementation is entitled to assume the following laws about these operations:
Minimal complete instance: range, index and inRange.
Methods
range :: (a, a) -> [a] |
index :: (a, a) -> a -> Int |
inRange :: (a, a) -> a -> Bool |
rangeSize :: (a, a) -> Int |
instance Ix Bool |
instance Ix Char |
instance Ix Int |
instance Ix Int8 |
instance Ix Int16 |
instance Ix Int32 |
instance Ix Int64 |
instance Ix Integer |
instance Ix Ordering |
instance Ix Word |
instance Ix Word8 |
instance Ix Word16 |
instance Ix Word32 |
instance Ix Word64 |
instance Ix () |
instance Ix GeneralCategory |
instance Ix SeekMode |
instance Ix IOMode |
instance (Ix a, Ix b) => Ix (a, b) |
instance (Ix a1, Ix a2, Ix a3) => Ix (a1, a2, a3) |
instance (Ix a1, Ix a2, Ix a3, Ix a4) => Ix (a1, a2, a3, a4) |
instance (Ix a1, Ix a2, Ix a3, Ix a4, Ix a5) => Ix (a1, a2, a3, a4, a5) |
It is possible to derive an instance of Ix automatically, using a deriving clause on a data declaration. Such derived instance declarations for the class Ix are only possible for enumerations (i.e. datatypes having only nullary constructors) and single-constructor datatypes, whose constituent types are instances of Ix. A Haskell implementation must provide Ix instances for tuples up to at least size 15.
For an enumeration, the nullary constructors are assumed to be numbered left-to-right with the indices being 0 to n-1 inclusive. This is the same numbering defined by the Enum class. For example, given the datatype:
we would have:
For single-constructor datatypes, the derived instance declarations are as shown for tuples:
(++) :: [a] -> [a] -> [a] |
If the first list is not finite, the result is the first list.
head :: [a] -> a |
last :: [a] -> a |
tail :: [a] -> [a] |
init :: [a] -> [a] |
null :: [a] -> Bool |
length :: [a] -> Int |
map :: (a -> b) -> [a] -> [b] |
reverse :: [a] -> [a] |
intersperse :: a -> [a] -> [a] |
intercalate :: [a] -> [[a]] -> [a] |
transpose :: [[a]] -> [[a]] |
subsequences :: [a] -> [[a]] |
permutations :: [a] -> [[a]] |
foldl :: (a -> b -> a) -> a -> [b] -> a |
The list must be finite.
foldl' :: (a -> b -> a) -> a -> [b] -> a |
foldl1 :: (a -> a -> a) -> [a] -> a |
foldl1' :: (a -> a -> a) -> [a] -> a |
foldr :: (a -> b -> b) -> b -> [a] -> b |
foldr1 :: (a -> a -> a) -> [a] -> a |
concat :: [[a]] -> [a] |
concatMap :: (a -> [b]) -> [a] -> [b] |
and :: [Bool] -> Bool |
or :: [Bool] -> Bool |
any :: (a -> Bool) -> [a] -> Bool |
all :: (a -> Bool) -> [a] -> Bool |
sum :: Num a => [a] -> a |
product :: Num a => [a] -> a |
maximum :: Ord a => [a] -> a |
minimum :: Ord a => [a] -> a |
scanl :: (a -> b -> a) -> a -> [b] -> [a] |
Note that
scanl1 :: (a -> a -> a) -> [a] -> [a] |
scanr :: (a -> b -> b) -> b -> [a] -> [b] |
scanr1 :: (a -> a -> a) -> [a] -> [a] |
mapAccumL :: (acc -> x -> (acc, y)) -> acc -> [x] -> (acc, [y]) |
mapAccumR :: (acc -> x -> (acc, y)) -> acc -> [x] -> (acc, [y]) |
iterate :: (a -> a) -> a -> [a] |
repeat :: a -> [a] |
replicate :: Int -> a -> [a] |
cycle :: [a] -> [a] |
unfoldr :: (b -> Maybe (a, b)) -> b -> [a] |
In some cases, unfoldr can undo a foldr operation:
if the following holds:
A simple use of unfoldr:
take :: Int -> [a] -> [a] |
It is an instance of the more general Data.List.genericTake, in which n may be of any integral type.
drop :: Int -> [a] -> [a] |
It is an instance of the more general Data.List.genericDrop, in which n may be of any integral type.
splitAt :: Int -> [a] -> ([a], [a]) |
It is equivalent to (take n xs, drop n xs). splitAt is an instance of the more general Data.List.genericSplitAt, in which n may be of any integral type.
takeWhile :: (a -> Bool) -> [a] -> [a] |
dropWhile :: (a -> Bool) -> [a] -> [a] |
span :: (a -> Bool) -> [a] -> ([a], [a]) |
break :: (a -> Bool) -> [a] -> ([a], [a]) |
stripPrefix :: Eq a => [a] -> [a] -> Maybe [a] |
group :: Eq a => [a] -> [[a]] |
It is a special case of groupBy, which allows the programmer to supply their own equality test.
inits :: [a] -> [[a]] |
tails :: [a] -> [[a]] |
isPrefixOf :: Eq a => [a] -> [a] -> Bool |
isSuffixOf :: Eq a => [a] -> [a] -> Bool |
isInfixOf :: Eq a => [a] -> [a] -> Bool |
Example:
elem :: Eq a => a -> [a] -> Bool |
notElem :: Eq a => a -> [a] -> Bool |
lookup :: Eq a => a -> [(a, b)] -> Maybe b |
find :: (a -> Bool) -> [a] -> Maybe a |
filter :: (a -> Bool) -> [a] -> [a] |
partition :: (a -> Bool) -> [a] -> ([a], [a]) |
These functions treat a list xs as a indexed collection, with indices ranging from 0 to length xs - 1.
(!!) :: [a] -> Int -> a |
elemIndex :: Eq a => a -> [a] -> Maybe Int |
elemIndices :: Eq a => a -> [a] -> [Int] |
findIndex :: (a -> Bool) -> [a] -> Maybe Int |
findIndices :: (a -> Bool) -> [a] -> [Int] |
zip :: [a] -> [b] -> [(a, b)] |
zip3 :: [a] -> [b] -> [c] -> [(a, b, c)] |
zip4 :: [a] -> [b] -> [c] -> [d] -> [(a, b, c, d)] |
zip5 :: [a] -> [b] -> [c] -> [d] -> [e] -> [(a, b, c, d, e)] |
zip6 :: [a] |
-> [b] -> [c] -> [d] -> [e] -> [f] -> [(a, b, c, d, e, f)] |
zip7 :: [a] |
-> [b] |
-> [c] -> [d] -> [e] -> [f] -> [g] -> [(a, b, c, d, e, f, g)] |
zipWith :: (a -> b -> c) -> [a] -> [b] -> [c] |
zipWith3 :: (a -> b -> c -> d) -> [a] -> [b] -> [c] -> [d] |
zipWith4 :: (a -> b -> c -> d -> e) |
-> [a] -> [b] -> [c] -> [d] -> [e] |
zipWith5 :: (a -> b -> c -> d -> e -> f) |
-> [a] -> [b] -> [c] -> [d] -> [e] -> [f] |
zipWith6 :: (a -> b -> c -> d -> e -> f -> g) |
-> [a] -> [b] -> [c] -> [d] -> [e] -> [f] -> [g] |
zipWith7 :: (a -> b -> c -> d -> e -> f -> g -> h) |
-> [a] -> [b] -> [c] -> [d] -> [e] -> [f] -> [g] -> [h] |
unzip :: [(a, b)] -> ([a], [b]) |
unzip3 :: [(a, b, c)] -> ([a], [b], [c]) |
unzip4 :: [(a, b, c, d)] -> ([a], [b], [c], [d]) |
unzip5 :: [(a, b, c, d, e)] -> ([a], [b], [c], [d], [e]) |
unzip6 :: [(a, b, c, d, e, f)] -> ([a], [b], [c], [d], [e], [f]) |
unzip7 :: [(a, b, c, d, e, f, g)] |
-> ([a], [b], [c], [d], [e], [f], [g]) |
lines :: String -> [String] |
words :: String -> [String] |
unlines :: [String] -> String |
unwords :: [String] -> String |
nub :: Eq a => [a] -> [a] |
delete :: Eq a => a -> [a] -> [a] |
It is a special case of deleteBy, which allows the programmer to supply their own equality test.
(\\) :: Eq a => [a] -> [a] -> [a] |
It is a special case of deleteFirstsBy, which allows the programmer to supply their own equality test.
union :: Eq a => [a] -> [a] -> [a] |
Duplicates, and elements of the first list, are removed from the the second list, but if the first list contains duplicates, so will the result. It is a special case of unionBy, which allows the programmer to supply their own equality test.
intersect :: Eq a => [a] -> [a] -> [a] |
If the first list contains duplicates, so will the result.
It is a special case of intersectBy, which allows the programmer to supply their own equality test.
sort :: Ord a => [a] -> [a] |
insert :: Ord a => a -> [a] -> [a] |
By convention, overloaded functions have a non-overloaded counterpart whose name is suffixed with ‘By’.
The predicate is assumed to define an equivalence.
nubBy :: (a -> a -> Bool) -> [a] -> [a] |
deleteBy :: (a -> a -> Bool) -> a -> [a] -> [a] |
deleteFirstsBy :: (a -> a -> Bool) -> [a] -> [a] -> [a] |
unionBy :: (a -> a -> Bool) -> [a] -> [a] -> [a] |
intersectBy :: (a -> a -> Bool) -> [a] -> [a] -> [a] |
groupBy :: (a -> a -> Bool) -> [a] -> [[a]] |
The function is assumed to define a total ordering.
sortBy :: (a -> a -> Ordering) -> [a] -> [a] |
insertBy :: (a -> a -> Ordering) -> a -> [a] -> [a] |
maximumBy :: (a -> a -> Ordering) -> [a] -> a |
minimumBy :: (a -> a -> Ordering) -> [a] -> a |
The prefix ‘generic’ indicates an overloaded function that is a generalized version of a Prelude function.
genericLength :: Num i => [b] -> i |
genericTake :: Integral i => i -> [a] -> [a] |
genericDrop :: Integral i => i -> [a] -> [a] |
genericSplitAt :: Integral i => i -> [b] -> ([b], [b]) |
genericIndex :: Integral a => [b] -> a -> b |
genericReplicate :: Integral i => i -> a -> [a] |
data Maybe a |
= | Nothing | |
| | Just a | |
The Maybe type encapsulates an optional value. A value of type Maybe a either contains a value of type a (represented as Just a), or it is empty (represented as Nothing). Using Maybe is a good way to deal with errors or exceptional cases without resorting to drastic measures such as error.
The Maybe type is also a monad. It is a simple kind of error monad, where all errors are represented by Nothing. A richer error monad can be built using the Data.Either.Either type.
instance Monad Maybe |
instance Functor Maybe |
instance MonadPlus Maybe |
instance Eq a => Eq (Maybe a) |
instance Ord a => Ord (Maybe a) |
instance Read a => Read (Maybe a) |
instance Show a => Show (Maybe a) |
maybe :: b -> (a -> b) -> Maybe a -> b |
isJust :: Maybe a -> Bool |
isNothing :: Maybe a -> Bool |
fromJust :: Maybe a -> a |
fromMaybe :: a -> Maybe a -> a |
listToMaybe :: [a] -> Maybe a |
maybeToList :: Maybe a -> [a] |
catMaybes :: [Maybe a] -> [a] |
mapMaybe :: (a -> Maybe b) -> [a] -> [b] |
data Integral a => Ratio a |
instance Integral a => Enum (Ratio a) |
instance Integral a => Eq (Ratio a) |
instance Integral a => Fractional (Ratio a) |
instance Integral a => Num (Ratio a) |
instance Integral a => Ord (Ratio a) |
instance (Integral a, Read a) => Read (Ratio a) |
instance Integral a => Real (Ratio a) |
instance Integral a => RealFrac (Ratio a) |
instance Integral a => Show (Ratio a) |
type Rational = Ratio Integer |
(%) :: Integral a => a -> a -> Ratio a |
numerator :: Integral a => Ratio a -> a |
denominator :: Integral a => Ratio a -> a |
approxRational :: RealFrac a => a -> a -> Rational |
Any real interval contains a unique simplest rational; in particular, note that 0/1 is the simplest rational of all.
This module provides unsigned integer types of unspecified width (Word) and fixed widths (Word8, Word16, Word32 and Word64). All arithmetic is performed modulo 2^n, where n is the number of bits in the type.
For coercing between any two integer types, use fromIntegral. Coercing word types to and from integer types preserves representation, not sign.
The rules that hold for Enum instances over a bounded type such as Int (see the section of the Haskell language report dealing with arithmetic sequences) also hold for the Enum instances over the various Word types defined here.
Right and left shifts by amounts greater than or equal to the width of the type result in a zero result. This is contrary to the behaviour in C, which is undefined; a common interpretation is to truncate the shift count to the width of the type, for example 1 << 32 == 1 in some C implementations.
data Word |
instance Bounded Word |
instance Enum Word |
instance Eq Word |
instance Integral Word |
instance Num Word |
instance Ord Word |
instance Read Word |
instance Real Word |
instance Show Word |
instance Ix Word |
instance Storable Word |
instance Bits Word |
data Word8 |
instance Bounded Word8 |
instance Enum Word8 |
instance Eq Word8 |
instance Integral Word8 |
instance Num Word8 |
instance Ord Word8 |
instance Read Word8 |
instance Real Word8 |
instance Show Word8 |
instance Ix Word8 |
instance Storable Word8 |
instance Bits Word8 |
data Word16 |
instance Bounded Word16 |
instance Enum Word16 |
instance Eq Word16 |
instance Integral Word16 |
instance Num Word16 |
instance Ord Word16 |
instance Read Word16 |
instance Real Word16 |
instance Show Word16 |
instance Ix Word16 |
instance Storable Word16 |
instance Bits Word16 |
data Word32 |
instance Bounded Word32 |
instance Enum Word32 |
instance Eq Word32 |
instance Integral Word32 |
instance Num Word32 |
instance Ord Word32 |
instance Read Word32 |
instance Real Word32 |
instance Show Word32 |
instance Ix Word32 |
instance Storable Word32 |
instance Bits Word32 |
data Word64 |
instance Bounded Word64 |
instance Enum Word64 |
instance Eq Word64 |
instance Integral Word64 |
instance Num Word64 |
instance Ord Word64 |
instance Read Word64 |
instance Real Word64 |
instance Show Word64 |
instance Ix Word64 |
instance Storable Word64 |
instance Bits Word64 |
The module Foreign combines the interfaces of all modules providing language-independent marshalling support, namely
module Data.Bits |
module Data.Int |
module Data.Word |
module Foreign.Ptr |
module Foreign.ForeignPtr |
module Foreign.StablePtr |
module Foreign.Storable |
module Foreign.Marshal |
The module Foreign.C combines the interfaces of all modules providing C-specific marshalling support, namely
module Foreign.C.Types |
module Foreign.C.String |
module Foreign.C.Error |
The module Foreign.C.Error facilitates C-specific error handling of errno.
newtype Errno |
= | Errno CInt | |
Haskell representation for errno values. The implementation is deliberately exposed, to allow users to add their own definitions of Errno values.
instance Eq Errno |
Different operating systems and/or C libraries often support different values of errno. This module defines the common values, but due to the open definition of Errno users may add definitions which are not predefined.
isValidErrno :: Errno -> Bool |
getErrno :: IO Errno |
resetErrno :: IO () |
errnoToIOError |
:: | String | the location where the error occurred |
-> | Errno | the error number |
-> | Maybe Handle | optional handle associated with the error |
-> | Maybe String | optional filename associated with the error |
-> | IOError | |
Construct an IOError based on the given Errno value. The optional information can be used to improve the accuracy of error messages.
throwErrno |
:: | String | textual description of the error location |
-> | IO a | |
Throw an IOError corresponding to the current value of getErrno.
throwErrnoIf |
:: | (a -> Bool) | predicate to apply to the result value of the IO operation |
-> | String | textual description of the location |
-> | IO a | the IO operation to be executed |
-> | IO a | |
Throw an IOError corresponding to the current value of getErrno if the result value of the IO action meets the given predicate.
throwErrnoIf_ :: (a -> Bool) -> String -> IO a -> IO () |
throwErrnoIfRetry :: (a -> Bool) -> String -> IO a -> IO a |
throwErrnoIfRetry_ :: (a -> Bool) -> String -> IO a -> IO () |
throwErrnoIfMinus1 :: Num a => String -> IO a -> IO a |
throwErrnoIfMinus1_ :: Num a => String -> IO a -> IO () |
throwErrnoIfMinus1Retry :: Num a => String -> IO a -> IO a |
throwErrnoIfMinus1Retry_ :: Num a => String -> IO a -> IO () |
throwErrnoIfNull :: String -> IO (Ptr a) -> IO (Ptr a) |
throwErrnoIfNullRetry :: String -> IO (Ptr a) -> IO (Ptr a) |
throwErrnoIfRetryMayBlock |
as throwErrnoIfRetry, but additionally if the operation yields the error code eAGAIN or eWOULDBLOCK, an alternative action is executed before retrying.
throwErrnoIfRetryMayBlock_ :: (a -> Bool) |
-> String -> IO a -> IO b -> IO () |
throwErrnoIfMinus1RetryMayBlock :: Num a => String |
-> IO a -> IO b -> IO a |
throwErrnoIfMinus1RetryMayBlock_ :: Num a => String |
-> IO a -> IO b -> IO () |
throwErrnoIfNullRetryMayBlock :: String |
-> IO (Ptr a) -> IO b -> IO (Ptr a) |
throwErrnoPath :: String -> FilePath -> IO a |
throwErrnoPathIf :: (a -> Bool) |
-> String -> FilePath -> IO a -> IO a |
throwErrnoPathIf_ :: (a -> Bool) |
-> String -> FilePath -> IO a -> IO () |
throwErrnoPathIfNull :: String |
-> FilePath -> IO (Ptr a) -> IO (Ptr a) |
throwErrnoPathIfMinus1 :: Num a => String |
-> FilePath -> IO a -> IO a |
throwErrnoPathIfMinus1_ :: Num a => String |
-> FilePath -> IO a -> IO () |
Utilities for primitive marshalling of C strings.
The marshalling converts each Haskell character, representing a Unicode code point, to one or more bytes in a manner that, by default, is determined by the current locale. As a consequence, no guarantees can be made about the relative length of a Haskell string and its corresponding C string, and therefore all the marshalling routines include memory allocation. The translation between Unicode and the encoding of the current locale may be lossy.
type CString = Ptr CChar |
type CStringLen = (Ptr CChar, Int) |
Currently these functions are identical to their CAString counterparts; eventually they will use an encoding determined by the current locale.
peekCString :: CString -> IO String |
peekCStringLen :: CStringLen -> IO String |
newCString :: String -> IO CString |
newCStringLen :: String -> IO CStringLen |
withCString :: String -> (CString -> IO a) -> IO a |
withCStringLen :: String -> (CStringLen -> IO a) -> IO a |
charIsRepresentable :: Char -> IO Bool |
Currently only Latin-1 characters are representable.
These variants of the above functions are for use with C libraries that are ignorant of Unicode. These functions should be used with care, as a loss of information can occur.
castCharToCChar :: Char -> CChar |
castCCharToChar :: CChar -> Char |
castCharToCUChar :: Char -> CUChar |
castCUCharToChar :: CUChar -> Char |
castCharToCSChar :: Char -> CSChar |
castCSCharToChar :: CSChar -> Char |
peekCAString :: CString -> IO String |
peekCAStringLen :: CStringLen -> IO String |
newCAString :: String -> IO CString |
newCAStringLen :: String -> IO CStringLen |
withCAString :: String -> (CString -> IO a) -> IO a |
withCAStringLen :: String -> (CStringLen -> IO a) -> IO a |
These variants of the above functions are for use with C libraries that encode Unicode using the C wchar_t type in a system-dependent way. The only encodings supported are
type CWString = Ptr CWchar |
type CWStringLen = (Ptr CWchar, Int) |
peekCWString :: CWString -> IO String |
peekCWStringLen :: CWStringLen -> IO String |
newCWString :: String -> IO CWString |
newCWStringLen :: String -> IO CWStringLen |
withCWString :: String -> (CWString -> IO a) -> IO a |
withCWStringLen :: String -> (CWStringLen -> IO a) -> IO a |
These types are needed to accurately represent C function prototypes, in order to access C library interfaces in Haskell. The Haskell system is not required to represent those types exactly as C does, but the following guarantees are provided concerning a Haskell type CT representing a C type t:
These types are are represented as newtypes of types in Data.Int and Data.Word, and are instances of Eq, Ord, Num, Read, Show, Enum, Storable, Bounded, Real, Integral and Bits.
data CChar |
instance Bounded CChar |
instance Enum CChar |
instance Eq CChar |
instance Integral CChar |
instance Num CChar |
instance Ord CChar |
instance Read CChar |
instance Real CChar |
instance Show CChar |
instance Storable CChar |
instance Bits CChar |
data CSChar |
instance Bounded CSChar |
instance Enum CSChar |
instance Eq CSChar |
instance Integral CSChar |
instance Num CSChar |
instance Ord CSChar |
instance Read CSChar |
instance Real CSChar |
instance Show CSChar |
instance Storable CSChar |
instance Bits CSChar |
data CUChar |
instance Bounded CUChar |
instance Enum CUChar |
instance Eq CUChar |
instance Integral CUChar |
instance Num CUChar |
instance Ord CUChar |
instance Read CUChar |
instance Real CUChar |
instance Show CUChar |
instance Storable CUChar |
instance Bits CUChar |
data CShort |
instance Bounded CShort |
instance Enum CShort |
instance Eq CShort |
instance Integral CShort |
instance Num CShort |
instance Ord CShort |
instance Read CShort |
instance Real CShort |
instance Show CShort |
instance Storable CShort |
instance Bits CShort |
data CUShort |
instance Bounded CUShort |
instance Enum CUShort |
instance Eq CUShort |
instance Integral CUShort |
instance Num CUShort |
instance Ord CUShort |
instance Read CUShort |
instance Real CUShort |
instance Show CUShort |
instance Storable CUShort |
instance Bits CUShort |
data CInt |
instance Bounded CInt |
instance Enum CInt |
instance Eq CInt |
instance Integral CInt |
instance Num CInt |
instance Ord CInt |
instance Read CInt |
instance Real CInt |
instance Show CInt |
instance Storable CInt |
instance Bits CInt |
data CUInt |
instance Bounded CUInt |
instance Enum CUInt |
instance Eq CUInt |
instance Integral CUInt |
instance Num CUInt |
instance Ord CUInt |
instance Read CUInt |
instance Real CUInt |
instance Show CUInt |
instance Storable CUInt |
instance Bits CUInt |
data CLong |
instance Bounded CLong |
instance Enum CLong |
instance Eq CLong |
instance Integral CLong |
instance Num CLong |
instance Ord CLong |
instance Read CLong |
instance Real CLong |
instance Show CLong |
instance Storable CLong |
instance Bits CLong |
data CULong |
instance Bounded CULong |
instance Enum CULong |
instance Eq CULong |
instance Integral CULong |
instance Num CULong |
instance Ord CULong |
instance Read CULong |
instance Real CULong |
instance Show CULong |
instance Storable CULong |
instance Bits CULong |
data CPtrdiff |
instance Bounded CPtrdiff |
instance Enum CPtrdiff |
instance Eq CPtrdiff |
instance Integral CPtrdiff |
instance Num CPtrdiff |
instance Ord CPtrdiff |
instance Read CPtrdiff |
instance Real CPtrdiff |
instance Show CPtrdiff |
instance Storable CPtrdiff |
instance Bits CPtrdiff |
data CSize |
instance Bounded CSize |
instance Enum CSize |
instance Eq CSize |
instance Integral CSize |
instance Num CSize |
instance Ord CSize |
instance Read CSize |
instance Real CSize |
instance Show CSize |
instance Storable CSize |
instance Bits CSize |
data CWchar |
instance Bounded CWchar |
instance Enum CWchar |
instance Eq CWchar |
instance Integral CWchar |
instance Num CWchar |
instance Ord CWchar |
instance Read CWchar |
instance Real CWchar |
instance Show CWchar |
instance Storable CWchar |
instance Bits CWchar |
data CSigAtomic |
instance Bounded CSigAtomic |
instance Enum CSigAtomic |
instance Eq CSigAtomic |
instance Integral CSigAtomic |
instance Num CSigAtomic |
instance Ord CSigAtomic |
instance Read CSigAtomic |
instance Real CSigAtomic |
instance Show CSigAtomic |
instance Storable CSigAtomic |
instance Bits CSigAtomic |
data CLLong |
instance Bounded CLLong |
instance Enum CLLong |
instance Eq CLLong |
instance Integral CLLong |
instance Num CLLong |
instance Ord CLLong |
instance Read CLLong |
instance Real CLLong |
instance Show CLLong |
instance Storable CLLong |
instance Bits CLLong |
data CULLong |
instance Bounded CULLong |
instance Enum CULLong |
instance Eq CULLong |
instance Integral CULLong |
instance Num CULLong |
instance Ord CULLong |
instance Read CULLong |
instance Real CULLong |
instance Show CULLong |
instance Storable CULLong |
instance Bits CULLong |
data CIntPtr |
instance Bounded CIntPtr |
instance Enum CIntPtr |
instance Eq CIntPtr |
instance Integral CIntPtr |
instance Num CIntPtr |
instance Ord CIntPtr |
instance Read CIntPtr |
instance Real CIntPtr |
instance Show CIntPtr |
instance Storable CIntPtr |
instance Bits CIntPtr |
data CUIntPtr |
instance Bounded CUIntPtr |
instance Enum CUIntPtr |
instance Eq CUIntPtr |
instance Integral CUIntPtr |
instance Num CUIntPtr |
instance Ord CUIntPtr |
instance Read CUIntPtr |
instance Real CUIntPtr |
instance Show CUIntPtr |
instance Storable CUIntPtr |
instance Bits CUIntPtr |
data CIntMax |
instance Bounded CIntMax |
instance Enum CIntMax |
instance Eq CIntMax |
instance Integral CIntMax |
instance Num CIntMax |
instance Ord CIntMax |
instance Read CIntMax |
instance Real CIntMax |
instance Show CIntMax |
instance Storable CIntMax |
instance Bits CIntMax |
data CUIntMax |
instance Bounded CUIntMax |
instance Enum CUIntMax |
instance Eq CUIntMax |
instance Integral CUIntMax |
instance Num CUIntMax |
instance Ord CUIntMax |
instance Read CUIntMax |
instance Real CUIntMax |
instance Show CUIntMax |
instance Storable CUIntMax |
instance Bits CUIntMax |
These types are are represented as newtypes of basic foreign types, and are instances of Eq, Ord, Num, Read, Show, Enum and Storable.
data CClock |
instance Enum CClock |
instance Eq CClock |
instance Num CClock |
instance Ord CClock |
instance Read CClock |
instance Real CClock |
instance Show CClock |
instance Storable CClock |
data CTime |
instance Enum CTime |
instance Eq CTime |
instance Num CTime |
instance Ord CTime |
instance Read CTime |
instance Real CTime |
instance Show CTime |
instance Storable CTime |
These types are are represented as newtypes of Float and Double, and are instances of Eq, Ord, Num, Read, Show, Enum, Storable, Real, Fractional, Floating, RealFrac and RealFloat.
data CFloat |
instance Enum CFloat |
instance Eq CFloat |
instance Floating CFloat |
instance Fractional CFloat |
instance Num CFloat |
instance Ord CFloat |
instance Read CFloat |
instance Real CFloat |
instance RealFloat CFloat |
instance RealFrac CFloat |
instance Show CFloat |
instance Storable CFloat |
data CDouble |
instance Enum CDouble |
instance Eq CDouble |
instance Floating CDouble |
instance Fractional CDouble |
instance Num CDouble |
instance Ord CDouble |
instance Read CDouble |
instance Real CDouble |
instance RealFloat CDouble |
instance RealFrac CDouble |
instance Show CDouble |
instance Storable CDouble |
data CFile |
data CFpos |
data CJmpBuf |
data ForeignPtr a |
The ForeignPtr is parameterised in the same way as Ptr. The type argument of ForeignPtr should normally be an instance of class Storable.
instance Eq (ForeignPtr a) |
instance Ord (ForeignPtr a) |
instance Show (ForeignPtr a) |
type FinalizerPtr a = FunPtr (Ptr a -> IO ()) |
type FinalizerEnvPtr env a = FunPtr (Ptr env -> Ptr a -> IO ()) |
newForeignPtr :: FinalizerPtr a -> Ptr a -> IO (ForeignPtr a) |
newForeignPtr_ :: Ptr a -> IO (ForeignPtr a) |
addForeignPtrFinalizer :: FinalizerPtr a -> ForeignPtr a -> IO () |
newForeignPtrEnv :: FinalizerEnvPtr env a |
-> Ptr env -> Ptr a -> IO (ForeignPtr a) |
addForeignPtrFinalizerEnv :: FinalizerEnvPtr env a |
-> Ptr env -> ForeignPtr a -> IO () |
withForeignPtr :: ForeignPtr a -> (Ptr a -> IO b) -> IO b |
This function is normally used for marshalling data to or from the object pointed to by the ForeignPtr, using the operations from the Storable class.
finalizeForeignPtr :: ForeignPtr a -> IO () |
unsafeForeignPtrToPtr :: ForeignPtr a -> Ptr a |
To avoid subtle coding errors, hand written marshalling code should preferably use Foreign.ForeignPtr.withForeignPtr rather than combinations of unsafeForeignPtrToPtr and touchForeignPtr. However, the latter routines are occasionally preferred in tool generated marshalling code.
touchForeignPtr :: ForeignPtr a -> IO () |
Note that this function should not be used to express dependencies between finalizers on ForeignPtrs. For example, if the finalizer for a ForeignPtrF1 calls touchForeignPtr on a second ForeignPtrF2, then the only guarantee is that the finalizer for F2 is never started before the finalizer for F1. They might be started together if for example both F1 and F2 are otherwise unreachable.
In general, it is not recommended to use finalizers on separate objects with ordering constraints between them. To express the ordering robustly requires explicit synchronisation between finalizers.
castForeignPtr :: ForeignPtr a -> ForeignPtr b |
mallocForeignPtr :: Storable a => IO (ForeignPtr a) |
mallocForeignPtr is equivalent to
although it may be implemented differently internally: you may not assume that the memory returned by mallocForeignPtr has been allocated with Foreign.Marshal.Alloc.malloc.
mallocForeignPtrBytes :: Int -> IO (ForeignPtr a) |
mallocForeignPtrArray :: Storable a => Int -> IO (ForeignPtr a) |
mallocForeignPtrArray0 :: Storable a => Int -> IO (ForeignPtr a) |
The module Foreign.Marshal re-exports the other modules in the Foreign.Marshal hierarchy:
module Foreign.Marshal.Alloc |
module Foreign.Marshal.Array |
module Foreign.Marshal.Error |
module Foreign.Marshal.Utils |
and provides one function:
unsafeLocalState :: IO a -> a |
The only IO operations allowed in the IO action passed to unsafeLocalState are (a) local allocation (alloca, allocaBytes and derived operations such as withArray and withCString), and (b) pointer operations (Foreign.Storable and Foreign.Ptr) on the pointers to local storage, and (c) foreign functions whose only observable effect is to read and/or write the locally allocated memory. Passing an IO operation that does not obey these rules results in undefined behaviour.
It is expected that this operation will be replaced in a future revision of Haskell.
The module Foreign.Marshal.Alloc provides operations to allocate and deallocate blocks of raw memory (i.e., unstructured chunks of memory outside of the area maintained by the Haskell storage manager). These memory blocks are commonly used to pass compound data structures to foreign functions or to provide space in which compound result values are obtained from foreign functions.
If any of the allocation functions fails, a value of nullPtr is produced. If free or reallocBytes is applied to a memory area that has been allocated with alloca or allocaBytes, the behaviour is undefined. Any further access to memory areas allocated with alloca or allocaBytes, after the computation that was passed to the allocation function has terminated, leads to undefined behaviour. Any further access to the memory area referenced by a pointer passed to realloc, reallocBytes, or free entails undefined behaviour.
All storage allocated by functions that allocate based on a size in bytes must be sufficiently aligned for any of the basic foreign types that fits into the newly allocated storage. All storage allocated by functions that allocate based on a specific type must be sufficiently aligned for that type. Array allocation routines need to obey the same alignment constraints for each array element.
alloca :: Storable a => (Ptr a -> IO b) -> IO b |
The memory is freed when f terminates (either normally or via an exception), so the pointer passed to f must not be used after this.
allocaBytes :: Int -> (Ptr a -> IO b) -> IO b |
The memory is freed when f terminates (either normally or via an exception), so the pointer passed to f must not be used after this.
malloc :: Storable a => IO (Ptr a) |
The memory may be deallocated using free or finalizerFree when no longer required.
mallocBytes :: Int -> IO (Ptr a) |
The memory may be deallocated using free or finalizerFree when no longer required.
realloc :: Storable b => Ptr a -> IO (Ptr b) |
If the argument to realloc is nullPtr, realloc behaves like malloc.
reallocBytes :: Ptr a -> Int -> IO (Ptr a) |
If the pointer argument to reallocBytes is nullPtr, reallocBytes behaves like malloc. If the requested size is 0, reallocBytes behaves like free.
free :: Ptr a -> IO () |
finalizerFree :: FinalizerPtr a |
The module Foreign.Marshal.Array provides operations for marshalling Haskell lists into monolithic arrays and vice versa. Most functions come in two flavours: one for arrays terminated by a special termination element and one where an explicit length parameter is used to determine the extent of an array. The typical example for the former case are C’s NUL terminated strings. However, please note that C strings should usually be marshalled using the functions provided by Foreign.C.String as the Unicode encoding has to be taken into account. All functions specifically operating on arrays that are terminated by a special termination element have a name ending on 0—e.g., mallocArray allocates space for an array of the given size, whereas mallocArray0 allocates space for one more element to ensure that there is room for the terminator.
mallocArray :: Storable a => Int -> IO (Ptr a) |
mallocArray0 :: Storable a => Int -> IO (Ptr a) |
allocaArray :: Storable a => Int -> (Ptr a -> IO b) -> IO b |
allocaArray0 :: Storable a => Int -> (Ptr a -> IO b) -> IO b |
reallocArray :: Storable a => Ptr a -> Int -> IO (Ptr a) |
reallocArray0 :: Storable a => Ptr a -> Int -> IO (Ptr a) |
peekArray :: Storable a => Int -> Ptr a -> IO [a] |
peekArray0 :: (Storable a, Eq a) => a -> Ptr a -> IO [a] |
pokeArray :: Storable a => Ptr a -> [a] -> IO () |
pokeArray0 :: Storable a => a -> Ptr a -> [a] -> IO () |
newArray :: Storable a => [a] -> IO (Ptr a) |
newArray0 :: Storable a => a -> [a] -> IO (Ptr a) |
withArray :: Storable a => [a] -> (Ptr a -> IO b) -> IO b |
withArray0 :: Storable a => a -> [a] -> (Ptr a -> IO b) -> IO b |
withArrayLen :: Storable a => [a] -> (Int -> Ptr a -> IO b) -> IO b |
withArrayLen0 :: Storable a => a |
-> [a] -> (Int -> Ptr a -> IO b) -> IO b |
(argument order: destination, source)
copyArray :: Storable a => Ptr a -> Ptr a -> Int -> IO () |
moveArray :: Storable a => Ptr a -> Ptr a -> Int -> IO () |
lengthArray0 :: (Storable a, Eq a) => a -> Ptr a -> IO Int |
advancePtr :: Storable a => Ptr a -> Int -> Ptr a |
throwIf |
:: | (a -> Bool) | error condition on the result of the IO action |
-> | (a -> String) | computes an error message from erroneous results of the IO action |
-> | IO a | the IO action to be executed |
-> | IO a | |
Execute an IO action, throwing a userError if the predicate yields True when applied to the result returned by the IO action. If no exception is raised, return the result of the computation.
throwIf_ :: (a -> Bool) -> (a -> String) -> IO a -> IO () |
throwIfNeg :: (Ord a, Num a) => (a -> String) -> IO a -> IO a |
throwIfNeg_ :: (Ord a, Num a) => (a -> String) -> IO a -> IO () |
throwIfNull :: String -> IO (Ptr a) -> IO (Ptr a) |
void :: IO a -> IO () |
with :: Storable a => a -> (Ptr a -> IO b) -> IO b |
The memory is freed when f terminates (either normally or via an exception), so the pointer passed to f must not be used after this.
new :: Storable a => a -> IO (Ptr a) |
The memory may be deallocated using Foreign.Marshal.Alloc.free or Foreign.Marshal.Alloc.finalizerFree when no longer required.
fromBool :: Num a => Bool -> a |
toBool :: Num a => a -> Bool |
maybeNew :: (a -> IO (Ptr a)) -> Maybe a -> IO (Ptr a) |
maybeWith :: (a -> (Ptr b -> IO c) -> IO c) |
-> Maybe a -> (Ptr b -> IO c) -> IO c |
maybePeek :: (Ptr a -> IO b) -> Ptr a -> IO (Maybe b) |
withMany :: (a -> (b -> res) -> res) -> [a] -> ([b] -> res) -> res |
(argument order: destination, source)
copyBytes :: Ptr a -> Ptr a -> Int -> IO () |
moveBytes :: Ptr a -> Ptr a -> Int -> IO () |
The module Foreign.Ptr provides typed pointers to foreign entities. We distinguish two kinds of pointers: pointers to data and pointers to functions. It is understood that these two kinds of pointers may be represented differently as they may be references to data and text segments, respectively.
data Ptr a |
The type a will often be an instance of class Foreign.Storable.Storable which provides the marshalling operations. However this is not essential, and you can provide your own operations to access the pointer. For example you might write small foreign functions to get or set the fields of a C struct.
instance Eq (Ptr a) |
instance Ord (Ptr a) |
instance Show (Ptr a) |
instance Storable (Ptr a) |
nullPtr :: Ptr a |
castPtr :: Ptr a -> Ptr b |
plusPtr :: Ptr a -> Int -> Ptr b |
alignPtr :: Ptr a -> Int -> Ptr a |
minusPtr :: Ptr a -> Ptr b -> Int |
data FunPtr a |
A value of type FunPtr a may be a pointer to a foreign function, either returned by another foreign function or imported with a a static address import like
or a pointer to a Haskell function created using a wrapper stub declared to produce a FunPtr of the correct type. For example:
Calls to wrapper stubs like mkCompare allocate storage, which should be released with Foreign.Ptr.freeHaskellFunPtr when no longer required.
To convert FunPtr values to corresponding Haskell functions, one can define a dynamic stub for the specific foreign type, e.g.
instance Eq (FunPtr a) |
instance Ord (FunPtr a) |
instance Show (FunPtr a) |
instance Storable (FunPtr a) |
nullFunPtr :: FunPtr a |
castFunPtr :: FunPtr a -> FunPtr b |
castFunPtrToPtr :: FunPtr a -> Ptr b |
Note: this is valid only on architectures where data and function pointers range over the same set of addresses, and should only be used for bindings to external libraries whose interface already relies on this assumption.
castPtrToFunPtr :: Ptr a -> FunPtr b |
Note: this is valid only on architectures where data and function pointers range over the same set of addresses, and should only be used for bindings to external libraries whose interface already relies on this assumption.
freeHaskellFunPtr :: FunPtr a -> IO () |
data IntPtr |
instance Bounded IntPtr |
instance Enum IntPtr |
instance Eq IntPtr |
instance Integral IntPtr |
instance Num IntPtr |
instance Ord IntPtr |
instance Read IntPtr |
instance Real IntPtr |
instance Show IntPtr |
instance Storable IntPtr |
instance Bits IntPtr |
ptrToIntPtr :: Ptr a -> IntPtr |
intPtrToPtr :: IntPtr -> Ptr a |
data WordPtr |
instance Bounded WordPtr |
instance Enum WordPtr |
instance Eq WordPtr |
instance Integral WordPtr |
instance Num WordPtr |
instance Ord WordPtr |
instance Read WordPtr |
instance Real WordPtr |
instance Show WordPtr |
instance Storable WordPtr |
instance Bits WordPtr |
ptrToWordPtr :: Ptr a -> WordPtr |
wordPtrToPtr :: WordPtr -> Ptr a |
data StablePtr a |
A value of type StablePtr a is a stable pointer to a Haskell expression of type a.
instance Eq (StablePtr a) |
instance Storable (StablePtr a) |
newStablePtr :: a -> IO (StablePtr a) |
deRefStablePtr :: StablePtr a -> IO a |
freeStablePtr :: StablePtr a -> IO () |
castStablePtrToPtr :: StablePtr a -> Ptr () |
castPtrToStablePtr :: Ptr () -> StablePtr a |
for any stable pointer sp on which freeStablePtr has not been executed yet. Moreover, castPtrToStablePtr may only be applied to pointers that have been produced by castStablePtrToPtr.
The following definition is available to C programs inter-operating with Haskell code when including the header HsFFI.h.
Note that no assumptions may be made about the values representing stable pointers. In fact, they need not even be valid memory addresses. The only guarantee provided is that if they are passed back to Haskell land, the function deRefStablePtr will be able to reconstruct the Haskell value referred to by the stable pointer.
class Storable a where |
Memory addresses are represented as values of type Ptr a, for some a which is an instance of class Storable. The type argument to Ptr helps provide some valuable type safety in FFI code (you can’t mix pointers of different types without an explicit cast), while helping the Haskell type system figure out which marshalling method is needed for a given pointer.
All marshalling between Haskell and a foreign language ultimately boils down to translating Haskell data structures into the binary representation of a corresponding data structure of the foreign language and vice versa. To code this marshalling in Haskell, it is necessary to manipulate primitive data types stored in unstructured memory blocks. The class Storable facilitates this manipulation on all types for which it is instantiated, which are the standard basic types of Haskell, the fixed size Int types (Int8, Int16, Int32, Int64), the fixed size Word types (Word8, Word16, Word32, Word64), StablePtr, all types from Foreign.C.Types, as well as Ptr.
Minimal complete definition: sizeOf, alignment, one of peek, peekElemOff and peekByteOff, and one of poke, pokeElemOff and pokeByteOff.
Methods
sizeOf :: a -> Int |
alignment :: a -> Int |
peekElemOff :: Ptr a -> Int -> IO a |
Note that this is only a specification, not necessarily the concrete implementation of the function.
pokeElemOff :: Ptr a -> Int -> a -> IO () |
peekByteOff :: Ptr b -> Int -> IO a |
pokeByteOff :: Ptr b -> Int -> a -> IO () |
peek :: Ptr a -> IO a |
Note that the peek and poke functions might require properly aligned addresses to function correctly. This is architecture dependent; thus, portable code should ensure that when peeking or poking values of some type a, the alignment constraint for a, as given by the function alignment is fulfilled.
poke :: Ptr a -> a -> IO () |
instance Storable Bool |
instance Storable Char |
instance Storable Double |
instance Storable Float |
instance Storable Int |
instance Storable Int8 |
instance Storable Int16 |
instance Storable Int32 |
instance Storable Int64 |
instance Storable Word |
instance Storable Word8 |
instance Storable Word16 |
instance Storable Word32 |
instance Storable Word64 |
instance Storable WordPtr |
instance Storable IntPtr |
instance Storable CChar |
instance Storable CSChar |
instance Storable CUChar |
instance Storable CShort |
instance Storable CUShort |
instance Storable CInt |
instance Storable CUInt |
instance Storable CLong |
instance Storable CULong |
instance Storable CLLong |
instance Storable CULLong |
instance Storable CFloat |
instance Storable CDouble |
instance Storable CPtrdiff |
instance Storable CSize |
instance Storable CWchar |
instance Storable CSigAtomic |
instance Storable CClock |
instance Storable CTime |
instance Storable CIntPtr |
instance Storable CUIntPtr |
instance Storable CIntMax |
instance Storable CUIntMax |
instance Storable (StablePtr a) |
instance Storable (Ptr a) |
instance Storable (FunPtr a) |
showSigned |
:: | Real a | |
=> | (a -> ShowS) | a function that can show unsigned values |
-> | Int | the precedence of the enclosing context |
-> | a | the value to show |
-> | ShowS | |
showIntAtBase :: Integral a => a -> (Int -> Char) -> a -> ShowS |
showInt :: Integral a => a -> ShowS |
showHex :: Integral a => a -> ShowS |
showOct :: Integral a => a -> ShowS |
showEFloat :: RealFloat a => Maybe Int -> a -> ShowS |
In the call showEFloat digs val, if digs is Nothing, the value is shown to full precision; if digs is Just d, then at most d digits after the decimal point are shown.
showFFloat :: RealFloat a => Maybe Int -> a -> ShowS |
In the call showFFloat digs val, if digs is Nothing, the value is shown to full precision; if digs is Just d, then at most d digits after the decimal point are shown.
showGFloat :: RealFloat a => Maybe Int -> a -> ShowS |
In the call showGFloat digs val, if digs is Nothing, the value is shown to full precision; if digs is Just d, then at most d digits after the decimal point are shown.
showFloat :: RealFloat a => a -> ShowS |
floatToDigits :: RealFloat a => Integer -> a -> ([Int], Int) |
then
NB: readInt is the ’dual’ of showIntAtBase, and readDec is the ‘dual’ of showInt. The inconsistent naming is a historical accident.
readSigned :: Real a => ReadS a -> ReadS a |
readInt |
readDec :: Num a => ReadS a |
readOct :: Num a => ReadS a |
readHex :: Num a => ReadS a |
readFloat :: RealFrac a => ReadS a |
lexDigits :: ReadS String |
fromRat :: RealFloat a => Rational -> a |
getArgs :: IO [String] |
getProgName :: IO String |
However, this is hard-to-impossible to implement on some non-Unix OSes, so instead, for maximum portability, we just return the leafname of the program as invoked. Even then there are some differences between platforms: on Windows, for example, a program invoked as foo is probably really FOO.EXE, and that is what getProgName will return.
getEnv :: String -> IO String |
This computation may fail with:
data ExitCode |
= | ExitSuccess | indicates successful termination; |
| | ExitFailure Int | indicates program failure with an exit code. The exact interpretation of the code is operating-system dependent. In particular, some values may be prohibited (e.g. 0 on a POSIX-compliant system). |
Defines the exit codes that a program can return.
instance Eq ExitCode |
instance Ord ExitCode |
instance Read ExitCode |
instance Show ExitCode |
exitWith :: ExitCode -> IO a |
exitFailure :: IO a |
exitSuccess :: IO a |
data IO a |
There is really only one way to ”perform” an I/O action: bind it to Main.main in your program. When your program is run, the I/O will be performed. It isn’t possible to perform I/O from an arbitrary function, unless that function is itself in the IO monad and called at some point, directly or indirectly, from Main.main.
IO is a monad, so IO actions can be combined using either the do-notation or the >> and >>= operations from the Monad class.
instance Monad IO |
instance Functor IO |
type FilePath = String |
data Handle |
Most handles will also have a current I/O position indicating where the next input or output operation will occur. A handle is readable if it manages only input or both input and output; likewise, it is writable if it manages only output or both input and output. A handle is open when first allocated. Once it is closed it can no longer be used for either input or output, though an implementation cannot re-use its storage while references remain to it. Handles are in the Show and Eq classes. The string produced by showing a handle is system dependent; it should include enough information to identify the handle for debugging. A handle is equal according to == only to itself; no attempt is made to compare the internal state of different handles for equality.
instance Eq Handle |
instance Show Handle |
Three handles are allocated during program initialisation, and are initially open.
stdin :: Handle |
stdout :: Handle |
stderr :: Handle |
withFile :: FilePath -> IOMode -> (Handle -> IO r) -> IO r |
openFile :: FilePath -> IOMode -> IO Handle |
If the file does not exist and it is opened for output, it should be created as a new file. If mode is WriteMode and the file already exists, then it should be truncated to zero length. Some operating systems delete empty files, so there is no guarantee that the file will exist following an openFile with modeWriteMode unless it is subsequently written to successfully. The handle is positioned at the end of the file if mode is AppendMode, and otherwise at the beginning (in which case its internal position is 0). The initial buffer mode is implementation-dependent.
This operation may fail with:
data IOMode |
= | ReadMode | |
| | WriteMode | |
| | AppendMode | |
| | ReadWriteMode | |
See System.IO.openFile
instance Enum IOMode |
instance Eq IOMode |
instance Ord IOMode |
instance Read IOMode |
instance Show IOMode |
instance Ix IOMode |
hClose :: Handle -> IO () |
These functions are also exported by the Prelude.
readFile :: FilePath -> IO String |
writeFile :: FilePath -> String -> IO () |
appendFile :: FilePath -> String -> IO () |
Note that writeFile and appendFile write a literal string to a file. To write a value of any printable type, as with print, use the show function to convert the value to a string first.
Implementations should enforce as far as possible, at least locally to the Haskell process, multiple-reader single-writer locking on files. That is, there may either be many handles on the same file which manage input, or just one handle on the file which manages output. If any open or semi-closed handle is managing a file for output, no new handle can be allocated for that file. If any open or semi-closed handle is managing a file for input, new handles can only be allocated if they do not manage output. Whether two files are the same is implementation-dependent, but they should normally be the same if they have the same absolute path name and neither has been renamed, for example.
Warning: the readFile operation holds a semi-closed handle on the file until the entire contents of the file have been consumed. It follows that an attempt to write to a file (using writeFile, for example) that was earlier opened by readFile will usually result in failure with System.IO.Error.isAlreadyInUseError.
hFileSize :: Handle -> IO Integer |
hSetFileSize :: Handle -> Integer -> IO () |
hIsEOF :: Handle -> IO Bool |
NOTE: hIsEOF may block, because it has to attempt to read from the stream to determine whether there is any more data to be read.
isEOF :: IO Bool |
data BufferMode |
Three kinds of buffering are supported: line-buffering, block-buffering or no-buffering. These modes have the following effects. For output, items are written out, or flushed, from the internal buffer according to the buffer mode:
An implementation is free to flush the buffer more frequently, but not less frequently, than specified above. The output buffer is emptied as soon as it has been written out.
Similarly, input occurs according to the buffer mode for the handle:
The default buffering mode when a handle is opened is implementation-dependent and may depend on the file system object which is attached to that handle. For most implementations, physical files will normally be block-buffered and terminals will normally be line-buffered.
instance Eq BufferMode |
instance Ord BufferMode |
instance Read BufferMode |
instance Show BufferMode |
hSetBuffering :: Handle -> BufferMode -> IO () |
If the buffer mode is changed from BlockBuffering or LineBuffering to NoBuffering, then
This operation may fail with:
hGetBuffering :: Handle -> IO BufferMode |
hFlush :: Handle -> IO () |
This operation may fail with:
hGetPosn :: Handle -> IO HandlePosn |
hSetPosn :: HandlePosn -> IO () |
This operation may fail with:
data HandlePosn |
instance Eq HandlePosn |
instance Show HandlePosn |
hSeek :: Handle -> SeekMode -> Integer -> IO () |
If hdl is block- or line-buffered, then seeking to a position which is not in the current buffer will first cause any items in the output buffer to be written to the device, and then cause the input buffer to be discarded. Some handles may not be seekable (see hIsSeekable), or only support a subset of the possible positioning operations (for instance, it may only be possible to seek to the end of a tape, or to a positive offset from the beginning or current position). It is not possible to set a negative I/O position, or for a physical file, an I/O position beyond the current end-of-file.
This operation may fail with:
data SeekMode |
= | AbsoluteSeek | the position of hdl is set to i. |
| | RelativeSeek | the position of hdl is set to offset i from the current position. |
| | SeekFromEnd | the position of hdl is set to offset i from the end of the file. |
A mode that determines the effect of hSeekhdl mode i.
instance Enum SeekMode |
instance Eq SeekMode |
instance Ord SeekMode |
instance Read SeekMode |
instance Show SeekMode |
instance Ix SeekMode |
hTell :: Handle -> IO Integer |
This operation may fail with:
Each of these operations returns True if the handle has the the specified property, or False otherwise.
hIsTerminalDevice :: Handle -> IO Bool |
hSetEcho :: Handle -> Bool -> IO () |
hGetEcho :: Handle -> IO Bool |
hShow :: Handle -> IO String |
hWaitForInput :: Handle -> Int -> IO Bool |
If t is less than zero, then hWaitForInput waits indefinitely.
This operation may fail with:
hReady :: Handle -> IO Bool |
This operation may fail with:
hGetChar :: Handle -> IO Char |
This operation may fail with:
hGetLine :: Handle -> IO String |
This operation may fail with:
If hGetLine encounters end-of-file at any other point while reading in a line, it is treated as a line terminator and the (partial) line is returned.
hLookAhead :: Handle -> IO Char |
This operation may fail with:
hGetContents :: Handle -> IO String |
Any operation that fails because a handle is closed, also fails if a handle is semi-closed. The only exception is hClose. A semi-closed handle becomes closed:
Once a semi-closed handle becomes closed, the contents of the associated list becomes fixed. The contents of this final list is only partially specified: it will contain at least all the items of the stream that were evaluated prior to the handle becoming closed.
Any I/O errors encountered while a handle is semi-closed are simply discarded.
This operation may fail with:
hPutChar :: Handle -> Char -> IO () |
This operation may fail with:
hPutStr :: Handle -> String -> IO () |
This operation may fail with:
hPutStrLn :: Handle -> String -> IO () |
hPrint :: Show a => Handle -> a -> IO () |
This operation may fail with:
These functions are also exported by the Prelude.
interact :: (String -> String) -> IO () |
putChar :: Char -> IO () |
putStr :: String -> IO () |
putStrLn :: String -> IO () |
print :: Show a => a -> IO () |
For example, a program to print the first 20 integers and their powers of 2 could be written as:
getChar :: IO Char |
getLine :: IO String |
getContents :: IO String |
readIO :: Read a => String -> IO a |
readLn :: Read a => IO a |
type IOError = IOError |
userError :: String -> IOError |
mkIOError :: IOErrorType |
-> String -> Maybe Handle -> Maybe FilePath -> IOError |
annotateIOError :: IOError |
-> String -> Maybe Handle -> Maybe FilePath -> IOError |
isAlreadyExistsError :: IOError -> Bool |
isDoesNotExistError :: IOError -> Bool |
isAlreadyInUseError :: IOError -> Bool |
isFullError :: IOError -> Bool |
isEOFError :: IOError -> Bool |
isIllegalOperation :: IOError -> Bool |
isPermissionError :: IOError -> Bool |
isUserError :: IOError -> Bool |
data IOErrorType |
instance Eq IOErrorType |
instance Show IOErrorType |
alreadyExistsErrorType :: IOErrorType |
doesNotExistErrorType :: IOErrorType |
alreadyInUseErrorType :: IOErrorType |
fullErrorType :: IOErrorType |
eofErrorType :: IOErrorType |
illegalOperationErrorType :: IOErrorType |
permissionErrorType :: IOErrorType |
userErrorType :: IOErrorType |
ioError :: IOError -> IO a |
catch :: IO a -> (IOError -> IO a) -> IO a |
the function f returns [] when an end-of-file exception (cf. isEOFError) occurs in g; otherwise, the exception is propagated to the next outer handler.
When an exception propagates outside the main program, the Haskell system prints the associated IOError value and exits the program.
try :: IO a -> IO (Either IOError a) |
[1] J. Backus. Can programming be liberated from the von Neumann style? A functional style and its algebra of programs. CACM, 21(8):613–641, August 1978.
[2] Unicode Consortium. Unicode standard. http://unicode.org/standard/standard.html.
[3] H.K. Curry and R. Feys. Combinatory Logic. North-Holland Pub. Co., Amsterdam, 1958.
[4] Luis Damas and Robin Milner. Principal type-schemes for functional programs. In Conference Record of the 9th Annual ACM Symposium on Principles of Programming Languages, pages 207–12, New York, 1982. ACM Press.
[5] James Gosling, Bill Joy, and Guy Steele. The Java Language Specification. The Java Series. Addison-Wesley, 1997.
[6] R. Hindley. The principal type scheme of an object in combinatory logic. Transactions of the American Mathematical Society, 146:29–60, December 1969.
[7] International Standard ISO/IEC. Programming languages – C. 9899:1999 (E).
[8] MP Jones. A system of constructor classes: overloading and implicit higher-order polymorphism. Journal of Functional Programming, 5(1):1–36, January 1995.
[9] Brian W. Kernighan and Dennis M. Ritchie. The C Programming Language. Prentice Hall, second edition, 1988.
[10] Sheng Liang. The Java Native Interface: Programmer’s Guide and Specification. Addison Wesley, 1999.
[11] Tim Lindholm and Frank Yellin. The Java Virtual Machine Specification. Addison-Wesley, 1996.
[12] P. Penfield, Jr. Principal values and branch cuts in complex APL. In APL ’81 Conference Proceedings, pages 248–256, San Francisco, September 1981.
[13] P. Wadler and S. Blott. How to make ad hoc polymorphism less ad hoc. In Proceedings of 16th ACM Symposium on Principles of Programming Languages, pages 60–76, Austin, Texas, January 1989.