15 : DONT CROSS THE STREAMS!
25 Mar 2022
In this week’s episode we’re talking about Streams. We can never remember which one to use, whether we always need to create a Stream and then wrap it in a StreamReader/StreamWriter, or whether we can just use the static System.IO.File methods. Hopefully this will clear up some of our confusion.
Andy: There’s something I forgot to tell you. Don’t cross the streams.
Rowan: Why?
Andy: It would be bad.
Rowan: I’m fuzzy on the whole good/bad thing. What do you mean “bad”?
Andy: Try to imagine all life as you know it stopping instantaneously and every molecule in your body exploding at the speed of light.
Andy: Total protonic reversal.
Rowan: That’s bad. Okay. Alright, important safety tip, thanks Andy.
Random fact
- The original title was Ghost Smashers.
- Director Ivan Reitman made a couple of unorthodox appearances in the movie
  - He provided the "pigging out" noises for Slimer as he gorges on a pile of food before sliming Peter Venkman (Bill Murray’s character)
  - Reitman’s naturally deep voice also proved perfect for the moment when Dana becomes possessed and says “There is no Dana, only Zuul”
  - Bill Murray replies with “What a lovely singing voice you must have” 😂
Introduction
A stream is used to transfer data (read/write) from/to a wide range of sources and destinations.
So when you say streams, do you also mean things like Reactive Extensions (linq over events), or IAsyncEnumerable or are we talking about the low level, bit level access?
There is an abstract base class, System.IO.Stream, which all other stream classes in .NET derive from (FileStream, MemoryStream and many others). It provides an abstraction that every subclass must abide by, and it basically allows the reading and writing of bytes (not text) from/to a source or backing store. Backing stores can be files, IO devices or the network; sometimes there is no backing store at all, e.g. a signal generator, events etc.
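To make the "abstraction over the backing store" point concrete, here is a minimal sketch. The helper name `CountBytes` and the temp file are our own inventions for the demo; the point is only that the same method works against memory and a file because it only knows about Stream.

```csharp
using System;
using System.IO;

// Works with any readable Stream; it neither knows nor cares what the backing store is.
static long CountBytes(Stream source)
{
    var buffer = new byte[4096];
    long total = 0;
    int read;
    // Read returns 0 only when the end of the stream has been reached.
    while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
    {
        total += read;
    }
    return total;
}

// Backing store 1: memory.
using var memory = new MemoryStream(new byte[] { 1, 2, 3, 4, 5 });
long memoryCount = CountBytes(memory);

// Backing store 2: a file (a throwaway temp file created just for this demo).
string path = Path.GetTempFileName();
File.WriteAllBytes(path, new byte[] { 10, 20, 30 });
long fileCount;
using (var file = File.OpenRead(path))
{
    fileCount = CountBytes(file);
}
File.Delete(path);

Console.WriteLine($"memory: {memoryCount} bytes, file: {fileCount} bytes");
```

Swapping in a NetworkStream or a GZipStream should need no change to `CountBytes`, which is really the whole appeal of the base class.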
We can group the functions of the Stream class in three categories.
- Reading And Writing
- Seeking
- Buffering, Flushing and Disposing - this is important as a stream may hold an unmanaged resource (such as a file handle), so don’t forget to wrap it in a using
Common properties and methods of Streams
Properties:
- CanWrite
- CanRead
- CanSeek
- Length
- Position
- ReadTimeout
- WriteTimeout
Key methods:
- BeginRead / EndRead / Read / ReadByte / ReadAsync
- BeginWrite / EndWrite / Write / WriteByte / WriteAsync
- Seek
- CopyTo / CopyToAsync
- Flush / FlushAsync
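A quick sketch of a few of these members in action, using a MemoryStream so there is nothing to clean up. The byte values are arbitrary, and not every stream supports every member (a MemoryStream does not support timeouts, for example), which is why the Can* properties are worth checking first.

```csharp
using System;
using System.IO;

using var stream = new MemoryStream();

// Capability flags: check these rather than assuming.
Console.WriteLine($"CanRead={stream.CanRead}, CanWrite={stream.CanWrite}, CanSeek={stream.CanSeek}");

// Write four bytes; Position advances as we write.
stream.Write(new byte[] { 10, 20, 30, 40 }, 0, 4);
long lengthAfterWrite = stream.Length;      // 4
long positionAfterWrite = stream.Position;  // 4

// Seek back to the third byte and read it.
stream.Seek(2, SeekOrigin.Begin);
int third = stream.ReadByte();              // 30

// ReadByte returns -1 once we are at the end of the stream.
stream.Seek(0, SeekOrigin.End);
int atEnd = stream.ReadByte();              // -1

// Rewind, then copy everything into another stream.
stream.Position = 0;
using var copy = new MemoryStream();
stream.CopyTo(copy);
copy.Flush(); // a no-op for MemoryStream, but it matters for buffered or file-backed streams

Console.WriteLine($"Length={lengthAfterWrite}, third byte={third}, end={atEnd}, copied={copy.Length}");
```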
Reader and Writer classes (for writing encoded character data to streams) - always confuse me:
Point of confusion for writing to a file:
I always find myself asking - do I use a StreamWriter or use the Write method on a FileStream and convert to bytes myself, or just use the File.WriteAllText or File.AppendAllLines?
Reader and Writer types are for reading encoded characters from streams and writing them to streams - they do the conversions from / to byte data
This is because Streams are designed for byte input and output, therefore other classes are needed to do the conversion
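To show that the three options really are the same thing underneath, here is a small sketch that writes the same text three ways and compares the resulting bytes. The sample text, the temp file and the choice of UTF-8 without a byte order mark are our assumptions for the demo.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;

const string text = "Héllo, streams!";
// UTF-8 without a byte order mark, so all three approaches are directly comparable.
var utf8NoBom = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false);

// Option 1: convert to bytes yourself and write them to the Stream.
byte[] manual;
using (var ms = new MemoryStream())
{
    byte[] bytes = utf8NoBom.GetBytes(text);
    ms.Write(bytes, 0, bytes.Length);
    manual = ms.ToArray();
}

// Option 2: let a StreamWriter do the character-to-byte conversion.
byte[] viaWriter;
using (var ms = new MemoryStream())
{
    using (var writer = new StreamWriter(ms, utf8NoBom))
    {
        writer.Write(text);
    } // disposing the writer flushes it (and closes the underlying stream)
    viaWriter = ms.ToArray(); // ToArray still works on a closed MemoryStream
}

// Option 3: the one-shot static helper on System.IO.File.
string path = Path.GetTempFileName();
File.WriteAllText(path, text, utf8NoBom);
byte[] viaFile = File.ReadAllBytes(path);
File.Delete(path);

bool allEqual = manual.SequenceEqual(viaWriter) && manual.SequenceEqual(viaFile);
Console.WriteLine($"Same bytes from all three approaches: {allEqual}");
```

So the choice is mostly about convenience: the File helpers for small one-shot jobs, a StreamWriter when you are writing text incrementally, and raw bytes only when you already have bytes.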
I’m always confused around which reader / writer to use, particularly in the case of BinaryReader / Writer, or should I just write directly to the Stream if I can get bytes?
Example reader / writer classes are:
- BinaryReader and BinaryWriter – for reading and writing primitive data types as binary values - simplify writing primitive data types to a stream
- StreamReader and StreamWriter – for reading and writing characters by using an encoding value to convert the characters to and from bytes.
- StringReader and StringWriter – for reading and writing characters to and from strings - personally never found a need to use these.
- ABSTRACT CLASS: TextReader and TextWriter – serve as the abstract base classes for the StreamReader/StreamWriter and StringReader/StringWriter pairs above, and for other readers and writers that read and write characters and strings, but not binary data.
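One place StringReader and StringWriter can earn their keep is testing: if a method takes a TextWriter rather than a file path, you can capture its output in a string. A minimal sketch, where `WriteReport` is a made-up method for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Accepts any TextWriter: a StreamWriter, a StringWriter, Console.Out...
static void WriteReport(TextWriter output)
{
    output.WriteLine("Test 1");
    output.WriteLine("Test 2");
}

// Capture the output in a string instead of a file.
var sw = new StringWriter();
sw.NewLine = "\n"; // pin the line ending so the result is the same on every OS
WriteReport(sw);
string captured = sw.ToString();

// StringReader reads the lines back out of the string.
var lines = new List<string>();
using (var reader = new StringReader(captured))
{
    // ReadLine returns null when there is nothing left to read.
    while (reader.ReadLine() is string line)
    {
        lines.Add(line);
    }
}

// The very same method can write to the console.
WriteReport(Console.Out);
Console.WriteLine($"Captured {lines.Count} lines");
```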
Let’s look at a few examples of the common implementations we use
FileStream - provides read and write file operations
FileStream and StreamWriter - write out some text
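// File.Create truncates an existing file; File.OpenWrite would leave any old trailing content in place
using var fs = File.Create("output_textfile.txt");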
using var sw = new StreamWriter(fs);
sw.WriteLine("Test 1");
sw.WriteLine("Test 2");
sw.WriteLine("Test 3");
sw.WriteLine("Test 4");
FileStream and StreamReader - read the text back in
using var fs = File.OpenRead("output_textfile.txt");
using var sr = new StreamReader(fs);
while (!sr.EndOfStream)
{
    Console.WriteLine(sr.ReadLine());
}
or simpler versions without the need for the FileStream as StreamWriter/StreamReader have an overload that takes a path.
Write
using var sw = new StreamWriter("output_textfile.txt");
sw.WriteLine("Test 1");
sw.WriteLine("Test 2");
sw.WriteLine("Test 3");
sw.WriteLine("Test 4");
Read
using var sr = new StreamReader("output_textfile.txt");
while (!sr.EndOfStream)
{
    Console.WriteLine(sr.ReadLine());
}
You can even use the System.IO.File static methods to simplify things even further
// The braces matter: without them only the first WriteLine is inside the using
using (var sw = File.CreateText("newfile.txt"))
{
    sw.WriteLine("First line of example");
    sw.WriteLine("and second line");
}

using (var sr = File.OpenText("newfile.txt"))
{
    while (!sr.EndOfStream)
    {
        Console.WriteLine(sr.ReadLine());
    }
}
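For small files you can skip readers and writers altogether and use the one-shot helpers on System.IO.File. A minimal sketch; the file name and its location in the temp folder are just for the demo:

```csharp
using System;
using System.IO;

// Hypothetical demo file in the temp folder.
string path = Path.Combine(Path.GetTempPath(), "stream_notes_demo.txt");

// WriteAllText creates the file, or overwrites it if it already exists.
File.WriteAllText(path, "Test 1" + Environment.NewLine);

// AppendAllLines adds each string as its own line at the end of the file.
File.AppendAllLines(path, new[] { "Test 2", "Test 3" });

// ReadAllLines loads the whole file into memory in one go.
string[] lines = File.ReadAllLines(path);
foreach (string line in lines)
{
    Console.WriteLine(line);
}

File.Delete(path);
```

The trade-off is that these helpers read or write everything at once, so for large files the StreamReader/StreamWriter versions are probably the safer choice.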
FileStream with BinaryWriter and BinaryReader
float aspectRatio;
string tempDirectory;
int autoSaveTime;
bool showStatusBar;
using (var stream = File.Open("output_binfile.bin", FileMode.Create))
{
    using (var writer = new BinaryWriter(stream))
    {
        writer.Write(1.250F);
        writer.Write(@"c:\Temp");
        writer.Write(10);
        writer.Write(true);
    }
}
using (var stream = File.Open("output_binfile.bin", FileMode.Open))
{
    using (var reader = new BinaryReader(stream))
    {
        aspectRatio = reader.ReadSingle();
        tempDirectory = reader.ReadString();
        autoSaveTime = reader.ReadInt32();
        showStatusBar = reader.ReadBoolean();
    }
}
Span
So in summary just for reading/writing files...
when trying to read text files use StreamReader
when trying to write text files use StreamWriter
when trying to read binary files use FileStream with BinaryReader
when trying to write binary files use FileStream with BinaryWriter
MemoryStream - in memory stream, might use to prepare to write to another stream
e.g. query data from a database, process it, write all the contents to a MemoryStream and then copy the MemoryStream to a FileStream in one go,
- why? to minimise writes to file (IO) / minimise write time to file if it is a shared file for example
- multiple sources merged into a single stream?
using (MemoryStream ms = new MemoryStream())
{
    StreamWriter writer = new StreamWriter(ms);
    writer.WriteLine("asdasdasasdfasdasd");
    writer.Flush();
    // You have to rewind the MemoryStream before copying
    ms.Seek(0, SeekOrigin.Begin); // or ms.Position = 0;
    using (FileStream fs = new FileStream("output.txt", FileMode.OpenOrCreate))
    {
        ms.CopyTo(fs);
        fs.Flush();
    }
}
Layering of Streams (Streams created on streams)
Stream classes can be layered for operations such as the following:
buffering
- e.g. NetworkStream to connect to a network resource → BufferedStream over NetworkStream
- A buffered stream object creates an internal buffer, and reads and writes bytes from and to the backing store in whatever increments it thinks are most efficient. It will still fill your buffer in the increments you dictate, but your buffer is filled from the in-memory buffer, not from the backing store. The net effect is that the input and output are more efficient and thus faster. A BufferedStream object is composed around an existing Stream object that you have already created.
// Stopwatch needs: using System.Diagnostics;
var t1 = Stopwatch.StartNew();
// Use BufferedStream to buffer writes to a MemoryStream.
using (MemoryStream memory = new MemoryStream())
using (BufferedStream stream = new BufferedStream(memory))
{
    // Write a byte 5 million times.
    for (int i = 0; i < 5000000; i++)
    {
        stream.WriteByte(5);
    }
}
t1.Stop();
Console.WriteLine("BUFFEREDSTREAM TIME: " + t1.Elapsed.TotalMilliseconds);
t1.Restart();
// Use MemoryStream directly with no buffering.
using (MemoryStream memory = new MemoryStream())
{
    // Write a byte 5 million times.
    for (int i = 0; i < 5000000; i++)
    {
        memory.WriteByte(5);
    }
}
t1.Stop();
Console.WriteLine("MEMORYSTREAM TIME: " + t1.Elapsed.TotalMilliseconds);
compress / encrypt data to a file - byte by byte
e.g. FileStream to destination file → GZipStream over FileStream → source fileStream.CopyTo(GZipStream)
- Compress
var uncompressedFilebytes = File.ReadAllBytes(@"d:\temp\a_links.txt");
using (FileStream fs = new FileStream(@"d:\temp\a_links.txt.gz", FileMode.CreateNew))
using (GZipStream zipStream = new GZipStream(fs, CompressionMode.Compress))
{
    zipStream.Write(uncompressedFilebytes, 0, uncompressedFilebytes.Length);
}
- Decompress
// The source stream is wrapped in a using as well, so its file handle is released
using (FileStream compressedFileStream = File.OpenRead(@"d:\temp\a_links.txt.gz"))
using (FileStream fs = new FileStream(@"d:\temp\a_links.txt.gz.txt", FileMode.CreateNew))
using (GZipStream zipStream = new GZipStream(compressedFileStream, CompressionMode.Decompress))
{
    zipStream.CopyTo(fs);
}
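The same layering can be tried without touching the disk. This is a sketch of a compress/decompress round trip using MemoryStreams as the backing store and CopyTo to push data through the GZipStream; the 10,000-character payload is arbitrary, chosen only because it compresses well.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Text;

byte[] original = Encoding.UTF8.GetBytes(new string('a', 10_000));

// Compress: source.CopyTo(GZipStream layered over the destination).
byte[] compressed;
using (var source = new MemoryStream(original))
using (var destination = new MemoryStream())
{
    // leaveOpen: true keeps the destination usable after the GZipStream is disposed.
    using (var gzip = new GZipStream(destination, CompressionMode.Compress, leaveOpen: true))
    {
        source.CopyTo(gzip);
    } // disposing the GZipStream writes out the final compressed block
    compressed = destination.ToArray();
}

// Decompress: GZipStream layered over the compressed source, copied to the destination.
byte[] roundTripped;
using (var source = new MemoryStream(compressed))
using (var gzip = new GZipStream(source, CompressionMode.Decompress))
using (var destination = new MemoryStream())
{
    gzip.CopyTo(destination);
    roundTripped = destination.ToArray();
}

Console.WriteLine($"original: {original.Length} bytes, compressed: {compressed.Length} bytes");
Console.WriteLine($"round trip intact: {original.SequenceEqual(roundTripped)}");
```

The thing to watch is the order of disposal: the compressed bytes are only complete once the GZipStream itself has been disposed (or at least flushed), so read the destination after that point.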
So you can cross the streams in .NET but be careful!
Benefits of streams
Incremental data processing - don’t need to load everything into memory
Abstraction of backing store - don’t need to know how it works it’s just a stream
Flexibility / Control - low level binary file operations
Random access / seeking - access any part of a file (performant)
Composability / pipelines - chain multiple streams together to perform additional processing e.g. encryption, compression etc
Don’t forget to use System.IO.Abstractions and System.IO.Abstractions.TestingHelpers (https://www.nuget.org/packages/System.IO.Abstractions) - to allow you to mock everything stream related - TEST, TEST, TEST :-)
OS project/utility of the week
The NirSoft web site provides a unique collection of small and useful freeware utilities, all of them developed by Nir Sofer
Ivan Reitman OC (October 27, 1946 – February 12, 2022) was a Czechoslovak-born Canadian film and television director, producer and screenwriter. He was best known for his comedy work, especially in the 1980s and 1990s. He was the owner of The Montecito Picture Company, founded in 1998.
Films he directed include Meatballs (1979), Stripes (1981), Ghostbusters (1984), Ghostbusters II (1989), Twins (1988), Kindergarten Cop (1990), Dave (1993), and Junior (1994). Reitman also served as producer for such films as Animal House (1978), Beethoven (1992), Space Jam (1996), and Private Parts (1997).