Short intro about myself

I am a member of the Advanced Computing Systems Group under the direction of Sanjay J. Patel. Our work focuses on reliable, high-performance computing. My personal interests are in modeling future microarchitectures, specifically hardware-based dynamic optimization systems. My other areas of interest include compilers and operating systems.

Longer version...

I began attending the University of Illinois at Urbana-Champaign in August 1998, after two years at Freed-Hardeman University, a Christian college. Like most undergraduates, I did not have a particular technical focus during those years. I began graduate school in August 2000 and immediately started research with Sanjay Patel and Steve Lumetta in the area of computer architecture.

Sanjay had a great idea for dynamically optimizing programs using a hardware-based optimizer called rePLay. We experimented with that idea and turned it into a paper, which I presented at the International Symposium on Microarchitecture in December 2001. I based most of my Master's thesis on this work. At the time, we did not really know how much performance dynamic optimization could deliver, so I embarked on a study of its limits. Our findings showed that dynamic optimization is, in general, limited to performance improvements under 2x. This finding was rather disconcerting to the dynamic optimization community, and so we had to settle for filing a technical report on the findings rather than a full publication.

Following this work, we began to question the performance opportunities of performing optimization inside the processor pipeline, before instructions execute. We turned our experimentation into a publication that I presented at the International Symposium on Computer Architecture in June 2005. In our work on Continuous Optimization, we found that most traditional low-level compiler optimizations can be reduced to table-based optimization hardware that can be placed directly in the processor pipeline. The primary benefits of this type of optimization are reduced resource contention in the out-of-order processor pipeline and the ability to absorb data cache miss stalls.

During our work on rePLay, we proposed a hardware-based optimizer, but we did not really delve into the implementation details.
We have recently begun looking at using the Continuous Optimization hardware as the heart of a rePLay-style hardware-based optimizer. With only minor modifications, the Continuous Optimization hardware works very well in this capacity. We have also found that using the Continuous Optimization hardware in both roles simultaneously, in the processor pipeline and as the core of the rePLay-style optimizer, can provide a larger performance improvement than either role alone.