text-2.0: An efficient packed Unicode text type.
Copyright: (c) Bryan O'Sullivan 2009, 2012
License: BSD-style
Maintainer: bos@serpentine.com
Stability: experimental
Portability: GHC
Safe Haskell: Safe-Inferred
Language: Haskell2010

Data.Text.Internal.Fusion.Common

Description

Warning: this is an internal module, and does not have a stable API or name. Functions in this module may not check or enforce preconditions expected by public modules. Use at your own risk!

This module provides a common stream fusion interface for text. The stream interface allows us to write text pipelines which do not allocate intermediate text values. For example, we could guarantee no intermediate text is allocated by writing the following:

  getNucleotides :: Text -> Text
  getNucleotides =
        unstream
      . filter isNucleotide
      . toLower
      . stream
    where
      isNucleotide chr =
        chr == 'a' ||
        chr == 'c' ||
        chr == 't' ||
        chr == 'g'
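
For reference, the pipeline above can be turned into a small standalone program. This is only a sketch: it assumes that stream and unstream are imported from Data.Text.Internal.Fusion (this page does not say where they live) and that this module is imported qualified as S to avoid clashes with the Prelude. The sample input is made up.

```haskell
import Data.Text (Text)
import qualified Data.Text as T
import Data.Text.Internal.Fusion (stream, unstream)        -- assumed home of stream/unstream
import qualified Data.Text.Internal.Fusion.Common as S

-- Lower-case the input and keep only the four nucleotide letters,
-- all inside one stream pipeline between stream and unstream.
getNucleotides :: Text -> Text
getNucleotides = unstream . S.filter isNucleotide . S.toLower . stream
  where
    isNucleotide c = c `elem` "acgt"

main :: IO ()
main = print (getNucleotides (T.pack "ACxGT-ta"))
```

Only the final unstream materialises a Text value; the filter and the case conversion operate on the stream.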

Creation and elimination

singleton :: Char -> Stream Char

O(1) Convert a character into a Stream.

Properties

unstream . singleton = singleton

streamList :: [a] -> Stream a

O(n) Convert a list into a Stream.

Properties

unstream . streamList = pack

unstreamList :: Stream a -> [a]

O(n) Convert a Stream into a list.

Properties

unstreamList . stream = unpack
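
The conversion properties above can be checked on small inputs. The following is a sketch under the assumption that stream and unstream come from Data.Text.Internal.Fusion; the names fromChars, toChars and roundTrip are invented for illustration.

```haskell
import qualified Data.Text as T
import Data.Text.Internal.Fusion (stream, unstream)        -- assumed home of stream/unstream
import qualified Data.Text.Internal.Fusion.Common as S

-- A list of characters streamed and then materialised as Text.
fromChars :: T.Text
fromChars = unstream (S.streamList "abc")

-- A Text streamed and then drained into a list.
toChars :: String
toChars = S.unstreamList (stream (T.pack "abc"))

-- streamList is polymorphic, so other element types round-trip too.
roundTrip :: [Int]
roundTrip = S.unstreamList (S.streamList [1, 2, 3])

main :: IO ()
main = do
  print (fromChars == T.pack "abc")
  print (toChars == T.unpack (T.pack "abc"))
  print (roundTrip == [1, 2, 3])
  print (unstream (S.singleton 'x') == T.singleton 'x')
```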

streamCString# :: Addr# -> Stream Char

Stream the UTF-8-like packed encoding used by GHC to represent constant strings in generated code.

This encoding uses the byte sequence "\xc0\x80" to represent NUL, and the string is NUL-terminated.

Properties

 unstream (streamCString# addr#) = unpackCString# addr#

Basic interface

cons :: Char -> Stream Char -> Stream Char

O(n) Adds a character to the front of a Stream Char.

Properties

 unstream . cons c . stream = cons c

snoc :: Stream Char -> Char -> Stream Char

O(n) Adds a character to the end of a stream.

Properties

 unstream (snoc (stream t) c) = snoc t c

append :: Stream Char -> Stream Char -> Stream Char

O(n) Appends one Stream to the other.

Properties

 unstream (append (stream t1) (stream t2)) = append t1 t2
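
As a rough illustration of cons, snoc and append, the sketch below wraps a text in delimiters and joins two texts. It assumes stream and unstream are imported from Data.Text.Internal.Fusion; withEnds and joined are hypothetical helper names.

```haskell
import qualified Data.Text as T
import Data.Text.Internal.Fusion (stream, unstream)        -- assumed home of stream/unstream
import qualified Data.Text.Internal.Fusion.Common as S

-- Put '<' in front of the text and '>' behind it. Note the argument
-- order: cons takes the character first, snoc takes the stream first.
withEnds :: T.Text -> T.Text
withEnds t = unstream (S.snoc (S.cons '<' (stream t)) '>')

-- Concatenate two texts by appending their streams.
joined :: T.Text -> T.Text -> T.Text
joined a b = unstream (S.append (stream a) (stream b))

main :: IO ()
main = do
  print (withEnds (T.pack "abc"))
  print (joined (T.pack "foo") (T.pack "bar") == T.append (T.pack "foo") (T.pack "bar"))
```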

head :: Stream Char -> Char

O(1) Returns the first character of a Stream Char, which must be non-empty.

Properties

 head . stream = head

uncons :: Stream Char -> Maybe (Char, Stream Char)

O(1) Returns the first character and remainder of a Stream Char, or Nothing if empty.

Properties

 fmap fst . uncons . stream = fmap fst . uncons
 fmap (unstream . snd) . uncons . stream = fmap snd . uncons
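
One way to read the two uncons properties together is as a single function that splits off the first character and materialises the remainder. The sketch assumes stream and unstream from Data.Text.Internal.Fusion; splitHead is an illustrative name.

```haskell
import qualified Data.Text as T
import Data.Text.Internal.Fusion (stream, unstream)        -- assumed home of stream/unstream
import qualified Data.Text.Internal.Fusion.Common as S

-- Split a text into its first character and the rest, going through
-- the stream representation. Empty input gives Nothing.
splitHead :: T.Text -> Maybe (Char, T.Text)
splitHead t = fmap (\(c, rest) -> (c, unstream rest)) (S.uncons (stream t))

main :: IO ()
main = do
  print (splitHead (T.pack "abc"))
  print (splitHead T.empty)
  print (splitHead (T.pack "abc") == T.uncons (T.pack "abc"))
```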

last :: Stream Char -> Char

O(n) Returns the last character of a Stream Char, which must be non-empty.

Properties

 last . stream = last

tail :: Stream Char -> Stream Char

O(1) Returns all characters after the head of a Stream Char, which must be non-empty.

Properties

 unstream . tail . stream = tail

init :: Stream Char -> Stream Char

O(1) Returns all but the last character of a Stream Char, which must be non-empty.

Properties

 unstream . init . stream = init

null :: Stream Char -> Bool

O(1) Tests whether a Stream Char is empty or not.

Properties

 null . stream = null

lengthI :: Integral a => Stream Char -> a

O(n) Returns the number of characters in a string.

compareLengthI :: Integral a => Stream Char -> a -> Ordering

O(n) Compares the count of characters in a string to a number.

This function gives the same answer as comparing against the result of lengthI, but can short circuit if the count of characters is greater than the number or if the stream can't possibly be as long as the number supplied, and hence be more efficient.
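
A typical use of compareLengthI is a length threshold test that does not need the exact length. The sketch below assumes stream from Data.Text.Internal.Fusion; atLeast is a made-up helper.

```haskell
import qualified Data.Text as T
import Data.Text.Internal.Fusion (stream)                  -- assumed home of stream
import qualified Data.Text.Internal.Fusion.Common as S

-- True when the text has at least n characters. compareLengthI may
-- stop early instead of counting every character.
atLeast :: Int -> T.Text -> Bool
atLeast n t = S.compareLengthI (stream t) n /= LT

main :: IO ()
main = do
  let t = T.pack "hello"
  print (S.lengthI (stream t) :: Int)
  print (S.compareLengthI (stream t) (5 :: Int))
  print (atLeast 3 t, atLeast 10 t)
```

The result should always agree with compare (lengthI s) n; only the amount of work differs.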

isSingleton :: Stream Char -> Bool

O(n) Indicate whether a string contains exactly one element.

Properties

 isSingleton . stream = isSingleton

Transformations

map :: (Char -> Char) -> Stream Char -> Stream Char

O(n) map f xs is the Stream Char obtained by applying f to each element of xs.

Properties

 unstream . map f . stream = map f

intercalate :: Stream Char -> [Stream Char] -> Stream Char

intercalate str strs inserts the stream str in between the streams strs and concatenates the result.

Properties

 intercalate s = concat . intersperse s

intersperse :: Char -> Stream Char -> Stream Char

O(n) Take a character and place it between each of the characters of a Stream Char.

Properties

 unstream . intersperse c . stream = intersperse c
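
The three transformations in this section might be used as follows. This is a sketch, assuming stream and unstream from Data.Text.Internal.Fusion; underscored, csv and spaced are names chosen for the example.

```haskell
import qualified Data.Text as T
import Data.Text.Internal.Fusion (stream, unstream)        -- assumed home of stream/unstream
import qualified Data.Text.Internal.Fusion.Common as S

-- Replace every space with an underscore, one character at a time.
underscored :: T.Text -> T.Text
underscored = unstream . S.map (\c -> if c == ' ' then '_' else c) . stream

-- Join fields with ", " by intercalating a separator stream.
csv :: [T.Text] -> T.Text
csv fields = unstream (S.intercalate (stream (T.pack ", ")) (map stream fields))

-- Put a single space between neighbouring characters.
spaced :: T.Text -> T.Text
spaced = unstream . S.intersperse ' ' . stream

main :: IO ()
main = do
  print (underscored (T.pack "a b c"))
  print (csv (map T.pack ["x", "y", "z"]))
  print (spaced (T.pack "abc"))
```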

Case conversion

With Unicode text, it is incorrect to use combinators like map toUpper to case convert each character of a string individually. Instead, use the whole-string case conversion functions from this module. For correctness in different writing systems, these functions may map one input character to two or three output characters.

toCaseFold :: Stream Char -> Stream Char

O(n) Convert a string to folded case. This function is mainly useful for performing caseless (or case insensitive) string comparisons.

A string x is a caseless match for a string y if and only if:

toCaseFold x == toCaseFold y

The result string may be longer than the input string, and may differ from applying toLower to the input string. For instance, the Armenian small ligature men now (U+FB13) is case folded to the bigram men now (U+0574 U+0576), while the micro sign (U+00B5) is case folded to the Greek small letter mu (U+03BC) instead of itself.
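
The caseless-match definition can be written down directly. The sketch below assumes stream and unstream from Data.Text.Internal.Fusion; caselessEq and naiveEq are illustrative names, and the German sample word is an arbitrary choice ('\223' is U+00DF, the sharp s).

```haskell
import Data.Char (toLower)
import qualified Data.Text as T
import Data.Text.Internal.Fusion (stream, unstream)        -- assumed home of stream/unstream
import qualified Data.Text.Internal.Fusion.Common as S

-- Caseless comparison: fold both sides, then compare.
caselessEq :: T.Text -> T.Text -> Bool
caselessEq x y = fold x == fold y
  where
    fold = unstream . S.toCaseFold . stream

-- Per-character lowering, shown only for contrast. It cannot expand
-- one character into two, so it misses matches such as sharp s vs "ss".
naiveEq :: T.Text -> T.Text -> Bool
naiveEq x y = T.map toLower x == T.map toLower y

main :: IO ()
main = do
  let a = T.pack "Stra\223e"
      b = T.pack "STRASSE"
  print (caselessEq a b)
  print (naiveEq a b)
```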

toLower :: Stream Char -> Stream Char

O(n) Convert a string to lower case, using simple case conversion. The result string may be longer than the input string. For instance, the Latin capital letter I with dot above (U+0130) maps to the sequence Latin small letter i (U+0069) followed by combining dot above (U+0307).

Properties

 unstream . toLower . stream = toLower

toTitle :: Stream Char -> Stream Char

O(n) Convert a string to title case, using simple case conversion.

The first letter of the input is converted to title case, as is every subsequent letter that immediately follows a non-letter. Every letter that immediately follows another letter is converted to lower case.

The result string may be longer than the input string. For example, the Latin small ligature fl (U+FB02) is converted to the sequence Latin capital letter F (U+0046) followed by Latin small letter l (U+006C).

Note: this function does not take language or culture specific rules into account. For instance, in English, different style guides disagree on whether the book name "The Hill of the Red Fox" is correctly title cased—but this function will capitalize every word.

Properties

 unstream . toTitle . stream = toTitle

toUpper :: Stream Char -> Stream Char

O(n) Convert a string to upper case, using simple case conversion. The result string may be longer than the input string. For instance, the German eszett (U+00DF) maps to the two-letter sequence SS.

Properties

 unstream . toUpper . stream = toUpper
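
The length-changing behaviour of the case conversions can be observed on the examples mentioned in the descriptions. This sketch assumes stream and unstream from Data.Text.Internal.Fusion; upper, lower and title are hypothetical wrappers ('\223' is U+00DF and '\x130' is U+0130).

```haskell
import qualified Data.Text as T
import Data.Text.Internal.Fusion (stream, unstream)        -- assumed home of stream/unstream
import qualified Data.Text.Internal.Fusion.Common as S

upper, lower, title :: T.Text -> T.Text
upper = unstream . S.toUpper . stream
lower = unstream . S.toLower . stream
title = unstream . S.toTitle . stream

main :: IO ()
main = do
  -- The sharp s expands to two letters, so the result is one longer.
  print (upper (T.pack "stra\223e"))
  -- Capital I with dot above lowers to 'i' plus a combining dot.
  print (T.length (lower (T.singleton '\x130')))
  -- Each word of a simple ASCII phrase gets a leading capital.
  print (title (T.pack "hello world"))
```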

Justification

Folds

foldl :: (b -> Char -> b) -> b -> Stream Char -> b

foldl, applied to a binary operator, a starting value (typically the left-identity of the operator), and a Stream, reduces the Stream using the binary operator, from left to right.

Properties

 foldl f z0 . stream = foldl f z0

foldl' :: (b -> Char -> b) -> b -> Stream Char -> b

A strict version of foldl.

Properties

 foldl' f z0 . stream = foldl' f z0

foldl1 :: (Char -> Char -> Char) -> Stream Char -> Char

foldl1 is a variant of foldl that has no starting value argument, and thus must be applied to non-empty Streams.

Properties

 foldl1 f . stream = foldl1 f

foldl1' :: (Char -> Char -> Char) -> Stream Char -> Char

A strict version of foldl1.

Properties

 foldl1' f . stream = foldl1' f

foldr :: (Char -> b -> b) -> b -> Stream Char -> b

foldr, applied to a binary operator, a starting value (typically the right-identity of the operator), and a stream, reduces the stream using the binary operator, from right to left.

Properties

 foldr f z0 . stream = foldr f z0

foldr1 :: (Char -> Char -> Char) -> Stream Char -> Char

foldr1 is a variant of foldr that has no starting value argument, and thus must be applied to non-empty streams.

Properties

 foldr1 f . stream = foldr1 f
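
A few small folds, as a sketch. It assumes stream from Data.Text.Internal.Fusion; parseDecimal, toString and largest are illustrative names, and parseDecimal only makes sense for input consisting of decimal digits.

```haskell
import Data.Char (digitToInt)
import qualified Data.Text as T
import Data.Text.Internal.Fusion (stream)                  -- assumed home of stream
import qualified Data.Text.Internal.Fusion.Common as S

-- Strict left fold: accumulate a number from its decimal digits.
parseDecimal :: T.Text -> Int
parseDecimal = S.foldl' (\acc c -> acc * 10 + digitToInt c) 0 . stream

-- Right fold: rebuild the characters as a list.
toString :: T.Text -> String
toString = S.foldr (:) [] . stream

-- foldl1 needs no starting value but requires non-empty input.
largest :: T.Text -> Char
largest = S.foldl1 max . stream

main :: IO ()
main = do
  print (parseDecimal (T.pack "1234"))
  print (toString (T.pack "abc"))
  print (largest (T.pack "hello"))
```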

Special folds

concat :: [Stream Char] -> Stream Char

O(n) Concatenate a list of streams.

Properties

unstream . concat . fmap stream = concat

concatMap :: (Char -> Stream Char) -> Stream Char -> Stream Char

Map a function over a stream that results in a stream and concatenate the results.

Properties

unstream . concatMap (stream . f) . stream = concatMap f

any :: (Char -> Bool) -> Stream Char -> Bool

O(n) any p xs determines if any character in the stream xs satisfies the predicate p.

Properties

any f . stream = any f

all :: (Char -> Bool) -> Stream Char -> Bool

O(n) all p xs determines if all characters in the stream xs satisfy the predicate p.

Properties

all f . stream = all f

maximum :: Stream Char -> Char

O(n) maximum returns the maximum value from a stream, which must be non-empty.

Properties

maximum . stream = maximum

minimum :: Stream Char -> Char

O(n) minimum returns the minimum value from a stream, which must be non-empty.

Properties

minimum . stream = minimum
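
The special folds above could be exercised like this. The sketch assumes stream and unstream from Data.Text.Internal.Fusion; doubled, hasDigit, allLower and extremes are made-up names.

```haskell
import Data.Char (isDigit, isLower)
import qualified Data.Text as T
import Data.Text.Internal.Fusion (stream, unstream)        -- assumed home of stream/unstream
import qualified Data.Text.Internal.Fusion.Common as S

-- concatMap: expand every character into a two-character stream.
doubled :: T.Text -> T.Text
doubled = unstream . S.concatMap (\c -> S.streamList [c, c]) . stream

hasDigit, allLower :: T.Text -> Bool
hasDigit = S.any isDigit . stream
allLower = S.all isLower . stream

-- Smallest and largest character; the input must be non-empty.
extremes :: T.Text -> (Char, Char)
extremes t = (S.minimum (stream t), S.maximum (stream t))

main :: IO ()
main = do
  print (doubled (T.pack "abc"))
  print (hasDigit (T.pack "ab1"), allLower (T.pack "ab1"))
  print (extremes (T.pack "hello"))
```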

Construction

Scans

scanl :: (Char -> Char -> Char) -> Char -> Stream Char -> Stream Char

O(n) scanl is similar to foldl, but returns a stream of successive reduced values from the left. Conceptually, if we write the input stream as a list then we have:

scanl f z [x1, x2, ...] == [z, z `f` x1, (z `f` x1) `f` x2, ...]

Properties

head (scanl f z xs) = z
last (scanl f z xs) = foldl f z xs
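
A running maximum is a simple instance of scanl and makes both properties easy to check. The sketch assumes stream and unstream from Data.Text.Internal.Fusion; runningMax and the seed 'a' are arbitrary choices for the example.

```haskell
import qualified Data.Text as T
import Data.Text.Internal.Fusion (stream, unstream)        -- assumed home of stream/unstream
import qualified Data.Text.Internal.Fusion.Common as S

-- Running maximum of the characters seen so far, seeded with 'a'.
-- The output has one more character than the input (the seed).
runningMax :: T.Text -> T.Text
runningMax = unstream . S.scanl max 'a' . stream

main :: IO ()
main = do
  let t = T.pack "bad"
      scanned = S.scanl max 'a' (stream t)
  print (runningMax t)
  print (S.head scanned == 'a')
  print (S.last scanned == S.foldl max 'a' (stream t))
```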

Generation and unfolding

replicateCharI :: Integral a => a -> Char -> Stream Char

O(n) replicateCharI n c is a Stream Char of length n with c the value of every element.

replicateI :: Int64 -> Stream Char -> Stream Char

O(n*m) replicateI n t is a Stream Char consisting of the input t repeated n times.

unfoldr :: (a -> Maybe (Char, a)) -> a -> Stream Char

O(n), where n is the length of the result. The unfoldr function is analogous to the List unfoldr. unfoldr builds a stream from a seed value. The function takes the current seed and returns Nothing if it is done producing the stream, or Just (a,b), in which case a is the next Char in the stream and b is the seed value for further production.

Properties

unstream (unfoldr f z) = unfoldr f z

unfoldrNI :: Integral a => a -> (b -> Maybe (Char, b)) -> b -> Stream Char

O(n) Like unfoldr, unfoldrNI builds a stream from a seed value. However, the length of the result is limited by the first argument to unfoldrNI. This function is more efficient than unfoldr when the length of the result is known.

Properties

unstream (unfoldrNI n f z) = unfoldrN n f z
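
The generators in this section can be tried with a small step function. This is a sketch, assuming unstream and stream from Data.Text.Internal.Fusion; step, lettersFrom and firstThree are hypothetical names.

```haskell
import qualified Data.Text as T
import Data.Text.Internal.Fusion (stream, unstream)        -- assumed home of stream/unstream
import qualified Data.Text.Internal.Fusion.Common as S

-- Step function: emit the seed and move to its successor,
-- stopping once the seed passes 'e'.
step :: Char -> Maybe (Char, Char)
step c
  | c > 'e'   = Nothing
  | otherwise = Just (c, succ c)

-- Unbounded unfold: runs until step returns Nothing.
lettersFrom :: Char -> T.Text
lettersFrom = unstream . S.unfoldr step

-- Bounded unfold: stops after at most three characters.
firstThree :: Char -> T.Text
firstThree = unstream . S.unfoldrNI (3 :: Int) step

main :: IO ()
main = do
  print (lettersFrom 'a')
  print (firstThree 'a')
  print (unstream (S.replicateCharI (3 :: Int) 'x'))
  print (unstream (S.replicateI 2 (stream (T.pack "ab"))))
```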

Substrings

Breaking strings

take :: Integral a => a -> Stream Char -> Stream Char

O(n) take n, applied to a stream, returns the prefix of the stream of length n, or the stream itself if n is greater than the length of the stream.

Properties

unstream . take n . stream = take n

drop :: Integral a => a -> Stream Char -> Stream Char

O(n) drop n, applied to a stream, returns the suffix of the stream after the first n characters, or the empty stream if n is greater than the length of the stream.

Properties

unstream . drop n . stream = drop n

takeWhile :: (Char -> Bool) -> Stream Char -> Stream Char

takeWhile, applied to a predicate p and a stream, returns the longest prefix (possibly empty) of elements that satisfy p.

Properties

unstream . takeWhile p . stream = takeWhile p

dropWhile :: (Char -> Bool) -> Stream Char -> Stream Char

dropWhile p xs returns the suffix remaining after takeWhile p xs.

Properties

unstream . dropWhile p . stream = dropWhile p
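
The four functions pair up naturally into split-style helpers. The sketch below assumes stream and unstream from Data.Text.Internal.Fusion; splitAtI and spanI are invented names and are not part of this module.

```haskell
import qualified Data.Text as T
import Data.Text.Internal.Fusion (stream, unstream)        -- assumed home of stream/unstream
import qualified Data.Text.Internal.Fusion.Common as S

-- Split after the first n characters. The input is streamed twice,
-- once for each half.
splitAtI :: Int -> T.Text -> (T.Text, T.Text)
splitAtI n t = (unstream (S.take n (stream t)), unstream (S.drop n (stream t)))

-- Split at the first character that fails the predicate.
spanI :: (Char -> Bool) -> T.Text -> (T.Text, T.Text)
spanI p t = (unstream (S.takeWhile p (stream t)), unstream (S.dropWhile p (stream t)))

main :: IO ()
main = do
  print (splitAtI 2 (T.pack "hello"))
  print (splitAtI 10 (T.pack "hi"))
  print (spanI (/= 'l') (T.pack "hello"))
```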

Predicates

isPrefixOf :: Eq a => Stream a -> Stream a -> Bool

O(n) The isPrefixOf function takes two Streams and returns True iff the first is a prefix of the second.

Properties

 isPrefixOf (stream t1) (stream t2) = isPrefixOf t1 t2

Searching

elem :: Char -> Stream Char -> Bool

O(n) elem is the stream membership predicate.

Properties

 elem c . stream = elem c

filter :: (Char -> Bool) -> Stream Char -> Stream Char

O(n) filter, applied to a predicate and a stream, returns a stream containing those characters that satisfy the predicate.

Properties

 unstream . filter p . stream = filter p
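
A short sketch of the predicate and searching functions, assuming stream and unstream from Data.Text.Internal.Fusion. The helper names startsWith, contains and vowels are made up for the example.

```haskell
import qualified Data.Text as T
import Data.Text.Internal.Fusion (stream, unstream)        -- assumed home of stream/unstream
import qualified Data.Text.Internal.Fusion.Common as S

-- Prefix test on the streamed forms of both texts.
startsWith :: T.Text -> T.Text -> Bool
startsWith prefix t = S.isPrefixOf (stream prefix) (stream t)

-- Membership test for a single character.
contains :: Char -> T.Text -> Bool
contains c = S.elem c . stream

-- Keep only the lower-case ASCII vowels. The inner elem is the
-- Prelude one, applied to an ordinary String.
vowels :: T.Text -> T.Text
vowels = unstream . S.filter (`elem` "aeiou") . stream

main :: IO ()
main = do
  print (startsWith (T.pack "he") (T.pack "hello"))
  print (contains 'z' (T.pack "hello"))
  print (vowels (T.pack "education"))
```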

Indexing

findBy :: (Char -> Bool) -> Stream Char -> Maybe Char

O(n) The findBy function takes a predicate and a stream, and returns the first element matching the predicate, or Nothing if there is no such element.

Properties

 findBy p . stream = find p

indexI :: Integral a => Stream Char -> a -> Char

O(n) Stream index (subscript) operator, starting from 0.

Properties

 indexI (stream t) n = index t n

findIndexI :: Integral a => (Char -> Bool) -> Stream Char -> Maybe a

The findIndexI function takes a predicate and a stream and returns the index of the first element in the stream satisfying the predicate.

Properties

findIndexI p . stream = findIndex p

countCharI :: Integral a => Char -> Stream Char -> a

O(n) The countCharI function returns the number of times the query element appears in the given stream.

Properties

countCharI c . stream = countChar c
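
The indexing functions are polymorphic in their Integral type, so a type annotation is usually needed at the call site. The sketch below fixes that type to Int and assumes stream from Data.Text.Internal.Fusion; all helper names are illustrative.

```haskell
import Data.Char (isDigit)
import qualified Data.Text as T
import Data.Text.Internal.Fusion (stream)                  -- assumed home of stream
import qualified Data.Text.Internal.Fusion.Common as S

-- First digit in the text, if any.
firstDigit :: T.Text -> Maybe Char
firstDigit = S.findBy isDigit . stream

-- Zero-based position of the first digit, if any.
firstDigitAt :: T.Text -> Maybe Int
firstDigitAt = S.findIndexI isDigit . stream

-- Character at a zero-based position; the index must be in range.
charAt :: T.Text -> Int -> Char
charAt t n = S.indexI (stream t) n

-- Number of occurrences of a character.
countOf :: Char -> T.Text -> Int
countOf c = S.countCharI c . stream

main :: IO ()
main = do
  let t = T.pack "ab1c2"
  print (firstDigit t, firstDigitAt t)
  print (charAt t 1)
  print (countOf 'a' (T.pack "banana"))
```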

Zipping and unzipping

zipWith :: (a -> a -> b) -> Stream a -> Stream a -> Stream b

zipWith generalises zip by zipping with the function given as the first argument, instead of a tupling function.

Properties

 unstream (zipWith f (stream t1) (stream t2)) = zipWith f t1 t2
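
As a final sketch, zipWith can combine two texts character by character. It assumes stream and unstream from Data.Text.Internal.Fusion; pairwiseMax is a hypothetical name, and the result is expected to stop at the end of the shorter input, in line with the zipWith property above.

```haskell
import qualified Data.Text as T
import Data.Text.Internal.Fusion (stream, unstream)        -- assumed home of stream/unstream
import qualified Data.Text.Internal.Fusion.Common as S

-- Take the larger character at each position of the two inputs.
pairwiseMax :: T.Text -> T.Text -> T.Text
pairwiseMax a b = unstream (S.zipWith max (stream a) (stream b))

main :: IO ()
main = do
  let a = T.pack "abc"
      b = T.pack "bad"
  print (pairwiseMax a b)
  print (pairwiseMax a b == T.zipWith max a b)
```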