Type safety
Encyclopedia : T : TY : TYP : Type safety
In computer science, type safety is a property attributed to some, but not all, programming languages. The term is defined differently by different communities who use it — in particular, the formal type-theoretic definition is considerably stronger than what is understood by most programmers — but most uses have in common the notion of employing a type system to prevent certain forms of erroneous or undesirable program behavior (called type errors). This enforcement can be static, catching potential errors at compile time, or dynamic, associating type information with values at run time and consulting them as needed to detect imminent errors, or a combination of both. Type safety is a property of the programming language, not of the programs themselves. For example, it is possible to write a safe program in a type-unsafe language.
The behaviors classified as type errors by any given programming language are generally those that result from attempts to perform on some value (or values) an operation that is not appropriate to its type (or their types). Which of these notions is more fundamental is to a certain extent a matter of opinion: some language designers and programmers take the view that any operation not leading to program crashes, security flaws or other obvious failures is legitimate and need not be considered an error, while others prefer to classify any contravention of the programmer's intent (as communicated via typing annotations) to be erroneous and deserving of the label "unsafe". In the context of static type systems, type safety usually involves (among other things) a guarantee that the eventual value of any expression will be a legitimate member of that expression's static type (the precise requirement is more subtle than this — see, for example, subtype and polymorphism for complications).
Type safety is closely linked to so-called memory safety. To see why, assume for purposes of discussion that a certain implementation of a certain language has some type, say [t], such that there exists some sequence of bits (of the appropriate length) that is not a legitimate member of [t]. (All but the most degenerate type systems satisfy this criterion.) If that language is memory-unsafe to the extent of allowing data to be copied from unallocated or uninitialized memory into a variable of type [t], then it is not type safe, because such an operation might assign a non-[t] value to that variable. Conversely, if the language is type unsafe to the extent of allowing an arbitrary integer to be used as a pointer, then it is clearly not memory safe.
Most statically-typed languages provide a degree of type safety that is strictly stronger than memory safety, because their type systems enforce the proper use of abstract data types defined by programmers even when this is not strictly necessary for memory safety or for the prevention of any kind of catastrophic failure.
Definitions
Robin Milner provided the following slogan to describe type safety:
- "Well-typed programs never go wrong."
In 1994, Andrew Wright and Matthias Felleisen formulated what is now the standard definition and proof technique for type safety in languages defined by operational semantics. Under this approach, type safety is determined by two properties of the semantics of the programming language:
- (Type-) preservation
- "Well typedness" of programs remains invariant under the transition rules (i.e. evaluation rules or reduction rules) of the language.
- Progress
- A well typed program never gets "stuck", i.e., never gets into an undefined state where no further transitions are possible.
Type-safe and type-unsafe languages
Type safety is usually a requirement for any toy language proposed in programming language research. Many languages, on the other hand, are too big for human-generated type-safety proofs, as they often require checking thousands of cases. Nevertheless, languages such as Standard ML have a rigorously defined semantics and have been shown to be type safe. Some other languages such as Haskell and Java are believed to be type safe. Regardless of the properties of the language definition, certain errors may occur at runtime due to bugs in the implementation, or in linked libraries written in other languages, rendering a given implementation type unsafe in certain circumstances.
Memory management in type-safe languages
In order for a language to be completely type safe it either needs to have garbage collection or place other restrictions on the allocation and deallocation of memory (this section deals mainly with the former). Specifically, the language must not allow dangling pointers across structurally different types to exist. The reason is technical: suppose that a typed language (like Pascal) required that allocated memory had to be explicitly released. If a dangling pointer existed that still pointed to the old memory location, it is possible that a new data structure can get allocated in the same space with the slot the dangling pointer refers to now pointing to a different type. For example, if the pointer initially pointed to a structure with an integer field, but in the new object a pointer field was allocated in the place of the integer, then the pointer field could be changed to anything simply by changing the value of the integer field (via dereferencing the dangling pointer). Because it is not specified what would happen when such a pointer is changed, the language is not type safe. Most type-safe languages satisfy these restrictions by using garbage collection to implement memory management.
Garbage collectors themselves are best implemented in languages that allow pointer arithmetic, so the library that implements the collector itself is best done in a type-unsafe language or a language where type safety can be deactivated. C and C++ are often used.
Type safety and strong typing
Type safety is synonymous with one of the many definitions of strong typing; however, type safety and dynamic typing are not mutually exclusive. A dynamically typed language can be seen as a statically-typed language with a very permissive type system under which any syntactically correct program is well-typed; as long as its dynamic semantics ensures that no such program ever "goes wrong" in an appropriate sense, it satisfies the definition above and can be called type-safe.
Type safety issues in specific languages
Ada
The SPARK programming language is a subset of Ada eliminating all its potential ambiguities and insecurities while at the same time adding design by contract to the language features available.
C
Standard ML
See also
References
- Benjamin C. Pierce, Types and Programming Languages, MIT Press, 2002. (ISBN 0-262-16209-1) [link]
- Type Safe defined in the Portland Pattern Repository's Wiki [link]
- Andrew K. Wright and Matthias Felleisen, "A Syntactic Approach to Type Soundness," Information and Computation 115(1), pp. 38-94, 1994. [link]
- Stavros Macrakis, "Safety and power", ACM SIGSOFT Software Engineering Notes 7:2:25 (April 1982)[requires subscription]
From Wikipedia, the Free Encyclopedia. Original article here. Support Wikipedia by contributing or donating.
All text is available under the terms of the GNU Free Documentation License See Wikipedia Copyrights for details.
