Recently at work I was tasked with developing a small tool that displays charts of our data extraction progress over time.
After each iteration, the results of the data extraction were stored in several directories and compared against a handwritten reference set.
In short, this is the directory structure:
|--+ batch1
|  |--+ correct
|  |  `--- data.txt
|  |--+ guessed
|  |  |--- data.00000145a8ab68b9.txt
|  |  |--- data.00000145a92a530f.txt
|  |  `--- data.0000014594039f5b.txt
|  |--- input1.pdf
|  |--- input2.pdf
|  `--- input3.pdf
`--+ batch2
   |--+ correct
   |  `--- data.txt
   |--+ guessed
   |  |--- data.00000145a8ab68b9.txt
   |  |--- data.00000145a92a530f.txt
   |  `--- data.0000014594039f5b.txt
   |--- input4.pdf
   |--- input5.pdf
   `--- input6.pdf
The hexadecimal numbers in the file names are timestamps in milliseconds since the Unix epoch.
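To make the naming scheme concrete, here is a minimal sketch of how such a file name could be decoded. The helper name parseTimestamp is hypothetical and not part of the original tool; it also returns a UTCTime, whereas the tool itself works with LocalTime, so a time zone conversion would still be needed on top of this.

```haskell
import Data.Time.Clock (UTCTime)
import Data.Time.Clock.POSIX (posixSecondsToUTCTime)
import Numeric (readHex)
import System.FilePath (takeBaseName, takeExtension)

-- | Hypothetical helper: pull the hex field out of a name such as
-- "data.00000145a8ab68b9.txt" and read it as milliseconds since the epoch.
parseTimestamp :: FilePath -> Maybe UTCTime
parseTimestamp path =
  -- takeBaseName drops the directory and ".txt"; the remaining
  -- "extension" is then the ".<hex>" part of the name.
  case takeExtension (takeBaseName path) of
    '.' : hex ->
      case readHex hex :: [(Integer, String)] of
        -- Accept only a complete parse with no trailing characters.
        [(millis, "")] -> Just (posixSecondsToUTCTime (fromInteger millis / 1000))
        _              -> Nothing
    _ -> Nothing
```

A name without a hex field, such as the reference file data.txt, yields Nothing.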
Each file consisted of records in the format:
input1.pdf value from the first file
input2.pdf value from the second file
input3.pdf value from the third file
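As an illustration of this record format, a line can be split at its first space into the input file name and the extracted value. This is only a sketch: the Entry type is not shown in the post, so it is assumed here to be a plain String, and parseRecord and parseRecords are hypothetical names rather than the actual internals of readDataFile.

```haskell
-- Assumption: the real Entry type is not shown, so a String stands in for it.
type Entry = String

-- | Split one record at the first space: file name on the left,
-- the rest of the line (the value) on the right.
parseRecord :: String -> (Entry, String)
parseRecord line =
  let (name, rest) = break (== ' ') line
  in (name, drop 1 rest)  -- drop the separating space, if any

-- | Parse a whole file's contents, skipping empty lines.
parseRecords :: String -> [(Entry, String)]
parseRecords = map parseRecord . filter (not . null) . lines
```

With these definitions, parseRecord "input1.pdf value from the first file" evaluates to ("input1.pdf", "value from the first file").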
Long story short, it made no difference which programming language I wrote the tool in, so I picked Haskell. Let’s ignore most of the implementation details. What matters for this entry is that I implemented the following functions:
allCorrectFiles :: [FilePath] -> IO [FilePath]
allGuessedFiles :: [FilePath] -> IO [(LocalTime, FilePath)]
readDataFile :: FilePath -> IO [(Entry, String)]
getAllData :: [FilePath] -> IO ([(Entry, String)], [(LocalTime, Entry, String)])
 
The initial implementation of the getAllData function was straightforward, yet a bit clunky:
-- forM comes from Control.Monad
getAllData subDirs = do
	correctFiles <- allCorrectFiles subDirs
	guessedFiles <- allGuessedFiles subDirs
	correctData <- mapM readDataFile correctFiles
	-- read every guessed file and tag each record with the file's timestamp
	guessedData <- forM guessedFiles $ \(t,f) -> do
		x <- readDataFile f
		return $ map (\(e,s) -> (t,e,s)) x
	return (concat correctData, concat guessedData)
 
This code is quite ugly. Especially jarring are the return $ map combination and the concats in the final line.
After a while, I noticed that all the functions I call from getAllData have types of the form a -> IO [b]: a list inside IO, a double monad. So I added the transformers library to my project and rewrote the function as: