CMPT 225 Lecture Notes - Virgin Records, Terabyte, Binary Logarithm
Document Summary
Use 3 main memory pages: 2 input buffers, 1 output buffer. Even output buffer holds 4 records, but just writes and then next elements are added. Output buffer is full, so write out buffer to file. After you deal with an input buffer, get rid of that data, read in next block from that run. Will need to keep track of where you are in file in some way. When finished, have sorted run of size 16. Cost: n = number of records in file = 10,000,000. N = number of blocks (pages) in file = 1,000,000 (10 records per page) B = number of main memory pages (frames) = 100. Sort each page individually: 2n -> have to read entire file, have to read out. Merge pairs of sorted runs: 2n*log2(10,000,000) = 40n. Total: 42,000,000 disk accesses x 10ms = 420,000s. Total: ~230,000,000 main memory accesses x 50ns = 21. 5s.