I began working with Doctor Fukuda and the rest of the Multi-Agent Spatial Simulation (MASS) Library developers in June 2015. My research involved designing, implementing, and testing new parallel input and output (I/O) functionality within the MASS Library. My parallel I/O API is currently being used by a Master's student to read and analyze large climate datasets. Feel free to view my letter of recommendation from Doctor Fukuda, the Computing and Software Systems Chair at the University of Washington Bothell.
The University of Washington Climate Analysis (UWCA) is an application that uses the MASS Library to analyze and predict climate change by reading large amounts of climate data from NetCDF files, a file format created specifically for storing scientific data. UWCA reads from a massive set of climate data (20GB+), and fast analysis requires holding that data in memory. Because one computer does not have enough memory, the application must run on multiple computers. When the master computer reads the data and then sends it to the slave computers, the transfer creates a bottleneck that increases computation time. A solution to this problem is to read the climate data directly on the slave computers. Implementing parallel I/O within the MASS Library will increase NetCDF file read performance within UWCA, while simultaneously enhancing the I/O capabilities of the MASS Library.
MASS implements a distributed space, called “Places,” where multiple autonomous analyzing programs, called “Agents,” can move from “Place” to “Place.” The approach to parallel I/O is to allow each “Place” to open, read, write, and close the same file in parallel. To open or close a file, only one “Place” needs to perform the open or close operation. To read a file, the first “Place” to read loads the entire file into main memory (a buffer), and the “Places” that follow read from that buffer. To write to a file, every “Place” writes to main memory (a buffer), and the last “Place” to write flushes the entire buffer to the file.
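The read and write rules above can be illustrated with a minimal Java sketch. This is not the MASS Library API: the class `SharedFileBuffer`, its methods, and the `placeCount` and `totalSize` parameters are hypothetical names chosen for illustration, and the sketch assumes all “Places” on one computing node share a single instance. It also reads the file into one `byte[]`, so it only applies to files under 2GB; a real implementation for climate-scale data would need a different buffering scheme.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper (not part of MASS): one instance is shared by all
// Places on a computing node that access the same file.
final class SharedFileBuffer {
    private final Path file;
    private final int placeCount;   // number of Places sharing this file
    private byte[] buffer;          // whole-file buffer held in main memory
    private int writesPending;      // Places that have not written yet

    SharedFileBuffer(Path file, int placeCount) {
        this.file = file;
        this.placeCount = placeCount;
        this.writesPending = placeCount;
    }

    // The first Place to call read() loads the entire file from disk;
    // every later Place is served from the in-memory buffer.
    synchronized byte[] read(int offset, int length) throws IOException {
        if (buffer == null) {
            buffer = Files.readAllBytes(file);
        }
        return Arrays.copyOfRange(buffer, offset, offset + length);
    }

    // Every Place copies its slice into the buffer; the last Place to
    // write flushes the whole buffer to the file in a single operation.
    synchronized void write(int offset, byte[] data, int totalSize) throws IOException {
        if (buffer == null) {
            buffer = new byte[totalSize];
        } else if (buffer.length < totalSize) {
            buffer = Arrays.copyOf(buffer, totalSize);
        }
        System.arraycopy(data, 0, buffer, offset, data.length);
        if (--writesPending == 0) {
            Files.write(file, buffer);
            writesPending = placeCount; // reset for the next round of writes
        }
    }
}
```

Marking both methods `synchronized` keeps the sketch simple: only one “Place” at a time touches the buffer, so exactly one of them pays the cost of the disk read or the final disk write.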
Please refer to the following slides to learn about MASS Library parallel I/O in more detail:
MASS Parallel I/O can be used to increase read speeds for larger files (4GB+) by letting users read a file across multiple computing nodes.
MASS Parallel I/O supports both NetCDF and TXT file types, and can easily be extended to support more types.
Currently, the parallel I/O API is being used by a Master's student to re-implement UWCA and read and analyze large climate datasets.
We plan to write a research paper and submit it for publication once UWCA has been re-implemented.
Further Information: If you would like to learn about my early research in more detail, click on the following links to view my quarterly reports:
Summer 2015 Report |
Autumn 2015 Report |
Winter 2016 Report
If you would like to find out more about MASS, visit the MASS homepage.
All MASS source code is in a private Bitbucket repository; please message me directly if you would like to see my code. The Computing and Software Systems department at the University of Washington Bothell will release the MASS Library to the public soon.