Overview

Join us in Clemson, South Carolina, for a cutting-edge undergraduate research experience in data-intensive computing!

Data-intensive research is characterized by the need to efficiently acquire, store, transmit, manipulate, visualize, search, and analyze massive data sets. In recent years, investment in large-scale high-performance computing infrastructure has created an exciting opportunity to address "Big Data" problems that are becoming increasingly common in nearly every area of science and technology. Co-funded by the Department of Defense in partnership with the National Science Foundation, our 2016 summer REU program has the following goals:

  • Provide undergraduate research opportunities in a wide range of data-intensive computing projects organized by an experienced team of faculty mentors.

  • Provide training in valuable computational tools and techniques that will help students succeed in data-intensive research.

  • Increase students' understanding of cutting-edge research in "Big Data" areas, as well as enthusiasm for continued research at the graduate level.

The program runs for 8 weeks, from June 6, 2016 through July 29, 2016.

All students participating in the REU program will take part in tutorials on tools and techniques that are widely used in data-intensive computing research, as well as on useful professional development topics (e.g., applying to graduate school, writing NSF graduate fellowship applications). Weekly enrichment lectures will be provided by faculty mentors and visiting speakers to showcase the breadth of research opportunities available in the data-intensive computing domain. Undergraduate participants will have the opportunity to interact with incoming graduate students involved in a co-located NSF-funded program to help launch their research work. Numerous excursions, social events, and outings are also planned throughout the program. At the conclusion, students will present their work in a poster session, and funding is also available to help students travel to regional and national conferences to present their work.

Research Projects

Students will be matched with a faculty mentor at the beginning of the REU program, and each student will participate in a focused research project. We have a large team of experienced faculty mentors working with this REU who supervise research in a broad range of data-intensive computing areas, from algorithms and data mining/analytics to high-performance computing platforms to the software infrastructure required to support data-intensive applications. Research mentors include the following faculty:

Mentor | Research Interests
Amy Apon | Large-scale data analytics, high-performance computing
Sez Atamturktur | Simulation of complex systems, model validation analytics
Brian Dean | Algorithms, optimization, data mining, medical informatics
Rong Ge | Parallel and distributed systems, advanced architecture, green computing
Feng Luo | Big data analytics
Hongxin Hu | Security, networking, systems
Brian Malloy | Software engineering, graphics and visualization
Jim Martin | Wireless networking, communication, mobile devices
Linh Ngo | Data science tools and infrastructure
Ilya Safro | Algorithms, scientific computing, network science
Jacob Sorber | Mobile systems, sensor networks, pervasive computing
Pradip Srimani | Parallel and distributed computing
James Wang | Biological applications of data mining

Here are just a few examples of the data-intensive research projects students might have the opportunity to join:

  • Green Computing (Rong Ge): Power and energy consumption are critical concerns for a variety of computing devices, from cellphones to warehouse-scale computers. While power-aware hardware technologies are mandatory for power reduction and energy savings, system software must optimally manage power to meet different demands from applications for power efficiency. REU participants will have the chance to work on green computing on smartphones or heterogeneous computers accelerated with graphics processing units.

  • Data-Driven Cyberbullying Detection (Hongxin Hu): Students will help identify new person and situational factors associated with visual cyberbullying and design a cross-feature classifier for automatic visual cyberbullying detection for emerging mobile social networks. Students will also help build an adaptive cyberbullying intervention system to continuously monitor situation changes in cyberbullying and provide specific response strategies for each associated participant.

  • Validation of Complex Models (Sez Atamturktur): Computer simulations are routinely executed to predict the behavior of complex systems in many fields of engineering and science. These computer-aided predictions involve the theoretical foundation, numerical modeling, and supporting experimental data, all of which come with their associated errors. A natural question then arises concerning the validity of computer model predictions, especially in cases where these models are executed in support of high-consequence decision making. We will work on laying out a methodology for quantifying the degrading effects of incompleteness and inaccuracy of the theoretical foundation, numerical modeling, and experimental data on the computer model predictions. As a result of our study, the validity of model predictions will be judged and communicated between involved parties in a quantitative and objective manner.

  • Medical Informatics (Brian Dean): Students will explore large-scale data-driven research in the domain of biological and medical informatics. For example, an ongoing collaboration with neurologists at the Medical University of South Carolina is investigating the use of advanced signal processing and machine learning algorithms for detecting signs of epilepsy in massive EEG (brain wave) datasets, and another project involves the use of network analysis algorithms to characterize neural connectivity patterns in autistic individuals.

  • Data Science Tools and Infrastructures (Linh Ngo): I am part of the Data Science team of the Clemson Cyberinfrastructure Technology Integration group, with research interests in the investigation, development, optimization, evaluation, implementation, and deployment of next-generation data-intensive computing tools and infrastructures. We are also interested in the ingestion, curation, and integration of large-scale complex data sets and streaming data.

Many other project opportunities exist, and you should feel welcome to contact the individual faculty members listed above if you have specific questions about projects they might be leading.

Application Details and Instructions

The REU program provides all student participants with low-cost on-campus housing (if needed), a generous stipend, library access, membership in on-campus recreational facilities, and assistance with travel costs to/from Clemson.

Application Requirements:

  • Applicants must be U.S. citizens or permanent residents.

  • Applicants must be undergraduates in good standing at their home institutions, with plans to complete their degree program.

  • Students must be willing to work a minimum of 40 hours per week and take part in all REU activities (e.g., bi-weekly tutorials, enrichment lectures, excursions, poster sessions and presentations) in addition to their mentored research work.

To apply for the Clemson REU in data-intensive computing, please fill out the online application form available here (this form is the "common REU application" developed at UNC-Charlotte for computing REU programs, so you may have encountered it in previous REU applications as well). In addition to this application, at least one recommendation is required from a faculty member at your institution. The faculty member should complete this form and email it directly to Dr. Brian Dean (bcdean@clemson.edu) with the subject line REU RECOMMENDATION.

Application review will begin in March 2016. Applications will be accepted until all positions are filled.

If you have any questions about the details of this program, please feel welcome to contact the program director, Dr. Brian Dean (bcdean@clemson.edu).

About Clemson and its School of Computing

Located in the college town of Clemson in scenic upstate South Carolina near Lake Hartwell and the Blue Ridge Mountains, Clemson University is a public research university with approximately 22,000 students. Nearby cities include Anderson and Greenville, SC, and both Charlotte and Atlanta are about 2 hours away by car.

The School of Computing at Clemson is home to several hundred undergraduate and graduate students and roughly 40 faculty in three divisions: computer science, visual computing, and human-centered computing. Thanks to recent investments in high-performance computing, Clemson computing researchers now have access to a 20,000+-core supercomputer called the "Palmetto Cluster" running at nearly 400 teraflops, as well as a distributed "Condor" grid of thousands of CPUs across campus, and extensive resources for parallel computing on Graphics Processing Units (GPUs).

The site is funded by the Department of Defense in partnership with the NSF REU program, as well as Clemson University.

Previous REU programs in the Clemson School of Computing: