Kevin Skadron, Mircea Stan, Westley Weimer and Ahmed Abbasi receive $875,000 from NSF

EN-CS XPS:FULL: New Abstractions and Applications for Automata Computing

ABSTRACT
As society collects more and more data about the world around us, and digitizes more and more artifacts, “big data” promises unparalleled potential, but it also poses new and unique computational challenges. Turning data into useful knowledge in or near real time can have significant impact, such as enabling timely intervention in healthcare and fast response in cybersecurity. As technology constraints limit CPU performance, researchers and practitioners are increasingly looking to specialized processors to accelerate data analytics. The ability to extract patterns from unstructured data is an especially important task. This research project carries out a cross-stack investigation to evaluate the effectiveness of the automata computing paradigm to accelerate pattern mining of unstructured data.

Specifically, by leveraging the industry’s new Automata Processor (developed by Micron Technology), this project is (1) developing benchmark suites of truly diverse automata for performance comparison of real and simulated, existing and future automata engines, (2) developing new tools, including programming languages, systems, and architectural enhancements, to make automata computing intuitive and easy to adopt, (3) evaluating automata computing solutions to address real-world big-data applications, and (4) developing a set of educational and community-building activities to maximize the broader impact of the project outcomes. Successful implementation of this project will enable new automata-based abstractions to shed light on the performance of AP technology for various applications, such as pattern mining. This project will build the intellectual foundations to support and catalyze research, education, training, and adoption of automata-based solutions to address big-data challenges in industry, government, and society.
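
To make the paradigm concrete, here is a minimal sketch (illustrative only, not project code) of the automata computing model: many pattern-matching automata consume an input stream one symbol at a time, with all active states advancing in lockstep, which is essentially what the Automata Processor evaluates in hardware each cycle.

```python
# Minimal sketch (illustrative, not project code): software simulation of the
# automata computing model, in which many pattern-matching automata consume
# one input symbol per cycle with all active states advancing in lockstep.

def match_stream(patterns, stream):
    """Report (pattern_id, end_offset) for every pattern match in the stream."""
    active = set()  # partial matches as (pattern_id, next_position)
    reports = []
    for offset, symbol in enumerate(stream):
        next_active = set()
        # Start states stay enabled: a pattern may begin at any offset.
        for pid, pat in enumerate(patterns):
            if pat[0] == symbol:
                next_active.add((pid, 1))
        # Advance every active partial match on the current symbol.
        for pid, pos in active:
            if patterns[pid][pos] == symbol:
                next_active.add((pid, pos + 1))
        # Any automaton that reached its accepting state reports a match.
        for pid, pos in sorted(next_active):
            if pos == len(patterns[pid]):
                reports.append((pid, offset))
                next_active.discard((pid, pos))
        active = next_active
    return reports

print(match_stream(["gcag", "cag"], "agcagt"))  # [(0, 4), (1, 4)]
```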

John Stankovic receives $425,000 from NSF

CPS: Breakthrough: Wearables With Feedback Control

ABSTRACT

Recently there has been an increasing availability of smart wearables, including smart watches, bands, buttons, and pendants. Many of these devices are part of human-in-the-loop Cyber Physical Systems (CPS). With future fundamental advances at the intersection of communications, control, and computation for these energy- and resource-limited devices, there is great potential to revolutionize many CPS applications. Examples of possible applications include detecting and controlling hand washing to prevent the transmission of infections or bacteria, monitoring and using interventions to keep factory workers safe, detecting activities in the home for monitoring the elderly, and improving rehabilitation of stroke victims via controlled exercises. However, to date, much of the work on wearables concentrates only on sensing, collecting, and presenting data. For use in CPS it is necessary to consider the increased use of new sensing modalities, to apply feedback to close the control loop, and to focus on the fundamental issues of how both the environment and human behavior affect the cyber. In particular, since humans are intimately involved with wearables, it is necessary to increase understanding of how human behaviors affect and can be affected by the control loops, and how the systems can maintain safety.

This work develops generic underlying algorithms for processing smart wearable data rather than one-off solutions; it extends the understanding and control of wearable systems by addressing humans-in-the-loop behaviors; and it explicitly focuses on the impact of the environment and human behavior on the cyber. Novel ideas are proposed for each of these areas, along with a structure for their integration. For example, the algorithmic approach to support more robust, accurate, and efficient activity recognition using wearable devices is based on five fundamental concepts: (i) Direction Agnostic Modeling, (ii) Direction Aware Modeling, (iii) Spatial Reachability, (iv) Spatiotemporal Segmentation, and (v) Dynamic Space Time Warping. For dealing with humans-in-the-loop behaviors, Model Predictive Control (MPC) is extended to semantics-based MPC, which solves control problems that are not amenable to electromechanical laws and employs machine learning. Many CPS projects do not explicitly address how the uncertain world affects how the cyber must be developed in order to perform robustly and safely. A new *-aware software development paradigm treats these physical-cyber CPS issues as central tenets and serves as an integrating platform for all the proposed work. The *-aware paradigm focuses on how software must be made robust to handle the physical world while meeting safety and adaptability requirements.
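
For context, Dynamic Space Time Warping builds on classic Dynamic Time Warping, which aligns two time series that differ in speed but not in shape. The following is a minimal sketch of the classic algorithm (illustrative only; the project's variant presumably incorporates spatial information on top of this core idea):

```python
# Minimal sketch (assumptions, not project code): classic Dynamic Time Warping,
# the time-series alignment idea that "Dynamic Space Time Warping" extends.

def dtw(a, b, dist=lambda x, y: abs(x - y)):
    """Return the minimum-cost alignment between sequences a and b."""
    n, m = len(a), len(b)
    INF = float("inf")
    # cost[i][j] = best cost aligning a[:i] with b[:j]
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(a[i - 1], b[j - 1])
            # Warping choices: match, stretch a, or stretch b.
            cost[i][j] = d + min(cost[i - 1][j - 1],
                                 cost[i - 1][j],
                                 cost[i][j - 1])
    return cost[n][m]

# Two accelerometer-like traces that differ in speed, not in shape:
template = [0.0, 0.5, 1.0, 0.5, 0.0]
observed = [0.0, 0.2, 0.5, 0.9, 1.0, 0.6, 0.1]
print(dtw(template, observed))  # small cost despite different lengths
```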

Kamin Whitehouse & Hongning Wang receive $425,000 from NSF

Collaborative Sensing: An Approach for Immediately Scalable Sensing in Buildings

ABSTRACT

Buildings are complex systems with profound impact on human health, productivity, comfort, and energy consumption. Smart building technology promises to improve many aspects of building operation by applying sensor data toward more informed and precise building operation. Smart buildings are one important dimension of enabling sustainable Smart Cities. One of the challenges in smart buildings is the selection, placement, and installation of multiple sensors in the building. This can be both an expensive and time-consuming process. Poor placement of sensors can have a significant adverse impact on the ability to obtain energy savings. This research project aims to improve the scalability of smart building applications by developing new techniques called collaborative sensing that estimate the sensor data of one building based on sensor data collected in other buildings. The technique exploits patterns in sensor data that result from common patterns in the design and construction of buildings. If successful, this technique will create a fundamental shift in the scalability of smart building applications, such that they can be applied to a new building without the need to install new instrumentation. Additionally, the underlying mathematical techniques will generalize to other aspects of the built environment where patterns in design, construction, or usage create patterns in sensor data.


Smart building technology promises to improve many aspects of building operation by collecting and analyzing sensor data to support informed and precise building operation. However, adoption of smart building applications is inhibited by the fact that new sensors must be installed in every building, and that optimization of sensor placement may be difficult and require significant experimentation and effort. This research project develops an innovative approach based on the notion of collaborative sensing. In this approach the sensor data of one building is estimated based on sensor data collected in other buildings. The basic premise is that common design and construction patterns for buildings create a repeating structure in their sensor data. Thus, a sparse sensing basis can be used to represent sensor data from a broad range of buildings. A model of a building can be constructed from this sensing basis using only a small amount of data, such as utility meter readings, climate zone, and square footage. This low-dimensionality model can then be used to reconstruct sensor data for the building based on high-fidelity data collected in other buildings. This approach aims to create a shift to a new paradigm in which smart building functionality can be applied to new buildings without the need to install specialized instrumentation. Preliminary testing using publicly available sub-metering data from hundreds of buildings indicates that this approach is not only more scalable but also sometimes more accurate than state-of-the-art alternatives. If successful, this research will create a fundamental shift in the scalability of smart building applications. The underlying mathematical techniques will generalize to other aspects of the built environment where patterns in design, construction, or usage create patterns in sensor data. These techniques will be encapsulated in a Web service that allows people anywhere in the world to apply the proposed techniques to their own building. The project will contribute to the National Science Foundation’s dual missions of research and education, and both graduate and undergraduate researchers will be involved in all phases of this research.
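
As a rough illustration of the idea (with made-up data and dimensions, not the project's actual model), a new building's full sensor profile can be reconstructed by fitting a handful of basis weights to a few coarse observations:

```python
# Minimal sketch under stated assumptions (not the project's model): estimate a
# new building's hourly load profile as a low-dimensional combination of basis
# profiles learned from heavily instrumented buildings.
import numpy as np

rng = np.random.default_rng(0)

# Basis learned offline from instrumented buildings (here: random stand-ins).
hours = 24
basis = rng.random((hours, 3))          # 3 basis profiles, one per column

# For a new, uninstrumented building we only observe a few coarse readings,
# e.g. utility-meter values at a handful of hours.
observed_hours = [0, 6, 12, 18]
true_weights = np.array([0.5, 1.5, 0.2])
readings = basis[observed_hours] @ true_weights

# Fit the low-dimensional model from the sparse observations...
weights, *_ = np.linalg.lstsq(basis[observed_hours], readings, rcond=None)

# ...then reconstruct the full 24-hour profile without installing sensors.
estimate = basis @ weights
print(np.allclose(estimate, basis @ true_weights))  # True in this noiseless toy
```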

Yanjun (Jane) Qi, David Evans and Wes Weimer receive $494,884 from NSF

TWC: Small: Automatic Techniques for Evaluating and Hardening Machine Learning Classifiers in the Presence of Adversaries

ABSTRACT

New security exploits emerge far faster than manual analysts can analyze them, driving growing interest in automated machine learning tools for computer security. Classifiers based on machine learning algorithms have shown promising results for many security tasks including malware classification and network intrusion detection, but classic machine learning algorithms are not designed to operate in the presence of adversaries. Intelligent and adaptive adversaries may actively manipulate the information they present in attempts to evade a trained classifier, leading to a competition between the designers of learning systems and attackers who wish to evade them. This project is developing automated techniques for predicting how well classifiers will resist the evasions of adversaries, along with general methods to automatically harden machine-learning classifiers against adversarial evasion attacks.


At the junction between machine learning and computer security, this project involves two main tasks: (1) developing a framework that can automatically assess the robustness of a classifier by using evolutionary techniques to simulate an adversary’s efforts to evade that classifier; and (2) improving the robustness of classifiers by developing generic machine learning architectures that employ randomized models and co-evolution to automatically harden machine-learning classifiers against adversaries. The system aims to allow a classifier designer to understand how the classification performance of a model degrades under evasion attacks, enabling better-informed and more secure design choices. The framework is general and scalable, and takes advantage of the latest advances in machine learning and computer security.
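
To illustrate the flavor of task (1), here is a minimal sketch of an evolutionary evasion search, using a toy linear classifier and a hypothetical mutation operator; the project's framework operates on real classifiers and semantics-preserving transformations of real samples:

```python
# Minimal sketch (toy classifier and mutation; not the project's framework):
# an evolutionary search that mutates a malicious sample's feature vector,
# keeping the variants the classifier scores as most benign each generation.
import random

random.seed(1)

def classifier_score(features):
    """Toy stand-in for a trained classifier: higher means 'more malicious'."""
    return 0.6 * features[0] + 0.4 * features[1]

def mutate(features):
    """Small random perturbation (a stand-in for a semantics-preserving
    transformation of a real malware sample)."""
    i = random.randrange(len(features))
    child = list(features)
    child[i] = max(0.0, child[i] + random.uniform(-0.1, 0.05))
    return child

def evolve_evasion(seed, generations=50, population=20, threshold=0.5):
    pool = [seed]
    for _ in range(generations):
        pool = sorted((mutate(random.choice(pool)) for _ in range(population)),
                      key=classifier_score)[:5]  # keep the 5 most evasive
        if classifier_score(pool[0]) < threshold:
            return pool[0]  # a variant the classifier now labels benign
    return None

print(evolve_evasion([0.9, 0.8]))  # e.g. a mutated vector scoring below 0.5
```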

Connelly Barnes, Baishakhi Ray and Westley Weimer receive $450,000 from NSF

Translating Compilers for Visual Computing in Dynamic Languages

ABSTRACT
This collaborative project is developing technologies to enable students, scientists, and other non-expert developers to use computer languages that facilitate rapid prototyping, and yet still automatically convert such programs to have high performance. In this research, the PI and co-PIs focus on programs that operate over visual data, such as programs in computer graphics, computer vision, and visualization. Visual data is important because visual datasets are rapidly growing in size, due to the use of cell-phone cameras, the sharing of photos and videos online, and the growth of scientific and medical imaging. The intellectual merits are that specialized program optimizations are being developed specifically for visual computing and for languages that enable rapid prototyping, alongside techniques that allow the computer to automatically search through different candidate optimizations and choose the fastest one. The project’s broader significance and importance are that it will make the writing of computer programs that operate over visual datasets more accessible to novice programmers, make visual computing more accessible to a broader audience, permit faster research and development over visual programs, and make such programs themselves more efficient.

More specifically, this research program is producing translating compilers that are specialized to handle programs that compute over visual data. The group led by the PI is researching new compilers that translate code from dynamic languages into highly efficient code in a target language. Dynamic languages are defined as those with a very dynamic run-time model, for example, MATLAB, Python, and JavaScript. The target language is a language such as C that permits implementation of highly efficient programs. This research framework incorporates ideas from compilers, graphics, computer vision, visual perception, and formal and natural languages. The research will make a number of key intellectual contributions. First, new domain-specific translations and optimizations for visual computing will be formalized into manual rules that can be applied to any input program. Second, the team will research a novel approach of automatically learning translations, instead of using manually-coded rules. This can take the form of learning translation “suggestions” from humans, who can interactively suggest better output code. Third, a new search process based on offline auto-tuning will be used to select the translations that result in the fastest program. The success of the project will be verified against a comprehensive test suite of programs from computer vision and graphics.
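
As a simple illustration of the offline auto-tuning step (with hypothetical kernels, not the project's compiler), a tuner can time several candidate translations of the same visual kernel on representative input and keep the fastest:

```python
# Minimal sketch (hypothetical candidates, not the project's compiler): offline
# auto-tuning that times candidate translations of one visual kernel and keeps
# the fastest.
import timeit

def blur_naive(img):
    """Direct translation of a dynamic-language blur (nested loops)."""
    h, w = len(img), len(img[0])
    return [[(img[y][x] + img[y][max(x - 1, 0)] + img[y][min(x + 1, w - 1)]) / 3
             for x in range(w)] for y in range(h)]

def blur_rowwise(img):
    """Alternative schedule for the same blur: operate on whole rows at once."""
    out = []
    for row in img:
        shifted_l = [row[0]] + row[:-1]
        shifted_r = row[1:] + [row[-1]]
        out.append([(a + b + c) / 3 for a, b, c in zip(shifted_l, row, shifted_r)])
    return out

def autotune(candidates, img, repeats=20):
    """Time every candidate on representative input; return the fastest."""
    timings = {f.__name__: timeit.timeit(lambda f=f: f(img), number=repeats)
               for f in candidates}
    return min(timings, key=timings.get), timings

img = [[float((x * y) % 7) for x in range(64)] for y in range(64)]
print(autotune([blur_naive, blur_rowwise], img)[0])
```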

Baishakhi Ray receives $250,000 in NSF funding

TWC: Small: Collab: Automated Detection and Repair of Error Handling Bugs in SSL/TLS Implementations

ABSTRACT
Secure Sockets Layer (SSL)/Transport Layer Security (TLS) protocols are critical to internet security. However, the software that implements SSL/TLS protocols is especially vulnerable to security flaws and the consequences can be disastrous. A large number of security flaws in SSL/TLS implementations (such as man-in-the-middle attacks, denial-of-service attacks, and buffer overflow attacks) result from incorrect error handling. These errors are often hard to detect and localize using existing techniques because many of them do not display any obvious erroneous behaviors (e.g., crash, assertion failure, etc.) but they cause subtle inaccuracies that completely violate the security and privacy guarantees of SSL/TLS. This project aims to improve error handling mechanisms in SSL/TLS implementations by building novel tools that reduce developer effort in writing and maintaining correct error handling code while making SSL/TLS implementations more secure and robust.

This project develops a framework for improving the robustness of error handling code in SSL/TLS implementations. The framework has three main objectives. First, error specifications for different SSL/TLS functions are automatically inferred to learn how they communicate failures. Next, the inferred specifications are used to build a tool for automatically detecting error handling bugs. Finally, the framework also provides new program repair tools that can automatically fix the detected bugs. Therefore, the framework provides end-to-end assistance in maintaining error-handling code in SSL/TLS implementations and thus significantly improves internet security.
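
As a rough illustration of the detection step (a toy textual analysis, not the project's tool), an inferred error specification can be used to flag call sites whose return value is never checked against the failure condition:

```python
# Minimal sketch (toy analysis over a code snippet, not the project's tool):
# given inferred error specifications (which return value signals failure),
# flag call sites whose result is never tested.
import re

# Inferred specification: function name -> condition that signals failure.
error_spec = {"SSL_read": "<= 0", "SSL_get_verify_result": "!= X509_V_OK"}

c_source = """
    int n = SSL_read(ssl, buf, sizeof(buf));
    process(buf, n);                     /* BUG: n never checked */
    long v = SSL_get_verify_result(ssl);
    if (v != X509_V_OK) goto fail;       /* OK: result is checked */
"""

for func, failure in error_spec.items():
    assignment = re.search(rf"(\w+)\s*=\s*{func}\s*\(", c_source)
    if not assignment:
        continue
    var = assignment.group(1)
    # Crude heuristic: does the result variable appear in any if-condition?
    checked = re.search(rf"if\s*\([^)]*\b{re.escape(var)}\b", c_source)
    status = "checked" if checked else f"UNCHECKED (failure when {failure})"
    print(f"{func}: result '{var}' {status}")
```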

Samira Khan receives $174,803 from NSF

System-Level Detection, Modeling, and Mitigation of DRAM Failures to Enable Efficient Scaling of DRAM Memory

ABSTRACT

Future computing systems will be dominated by enormous amounts of data processing. These systems will have to compute over exponentially growing, user-focused data from ubiquitous networks and the Internet of Things (e.g., sensors, self-driving cars, mobile devices, social media). At the same time, the forward progress of scientific innovation will greatly depend on fast and efficient computation over the high-volume datasets generated by scientific experiments (analyzing gravitational waves, colliding particles in particle accelerators, etc.). However, current computing systems are bottlenecked by memory, while high-capacity, scalable memory is essential for fast and efficient data processing in the future. Unfortunately, DRAM, the predominant underlying memory technology, is facing a major scaling challenge: as DRAM scales down to smaller technology nodes, cells become more vulnerable, resulting in DRAM failures. Enabling a higher-capacity memory system without sacrificing reliability is a major research challenge.

This research focuses on developing fundamental breakthroughs that can enable scalable memory systems for future computing. This proposal provides a research plan and ideas to solve the DRAM scaling challenge with a completely new approach: separating the responsibility of providing reliable DRAM operation from the design of memory cells with smaller feature sizes. The central vision of this proposal is to develop system-level detection and mitigation techniques for DRAM failures, so that cells can be manufactured smaller without any reliability guarantee from the manufacturer. It is expected that the ideas developed in this research will bridge the gap between circuits and systems and enable a holistic approach to the DRAM scaling challenge. The cross-cutting nature of the work will influence circuit-level testing, computer architecture, and OS and systems design, and can potentially foster collaboration between the testing and systems/architecture communities. The ideas developed in this research will not only drive innovation in computing but will also help numerous scientific fields take a leap toward new discoveries. The results of this research will be integrated into existing and new courses to support student training and education, designed with a focus on attracting underrepresented minority groups to hardware and systems design to enhance diversity in the field.
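
As a toy illustration of system-level detection (not the proposal's actual mechanism), an OS-level scrubber can write test patterns through a memory region and retire any word that fails to hold them:

```python
# Minimal sketch (illustrative only, not the proposal's mechanism): a system-
# level scrubber that writes test patterns to a region, reads them back, and
# retires any word that fails, so weak cells never hold live data.
PATTERNS = [0x00, 0xFF, 0x55, 0xAA]  # classic data patterns that stress cells

class FaultyDRAM:
    """Toy DRAM model in which one word has a stuck-at-zero bit."""
    def __init__(self, size):
        self.cells = [0] * size
    def write(self, addr, value):
        self.cells[addr] = value & (0xFE if addr == 3 else 0xFF)
    def read(self, addr):
        return self.cells[addr]

def scrub(dram, size):
    """Return addresses that fail any pattern; the OS can unmap those pages."""
    bad = set()
    for pattern in PATTERNS:
        for addr in range(size):
            dram.write(addr, pattern)
        for addr in range(size):
            if dram.read(addr) != pattern:
                bad.add(addr)
    return sorted(bad)

dram = FaultyDRAM(8)
print(scrub(dram, 8))  # [3]: the word with the weak cell is detected
```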

Mark Sherriff receives $104,173 from NSF

Collaborative Research: Transforming Computer Science Education Research Through Use of Appropriate Empirical Research

ABSTRACT
The significance and importance of this project resides in its focus on increasing the value and effectiveness of Computer Science Education (CSEd) by providing computer science educators with the tools and the mentoring necessary to properly evaluate novel teaching methods. As the demand for computer science graduates increases, educators must effectively educate students at scale, which requires innovation in teaching and learning techniques. This project will help move the CSEd community from reflective teaching to the Scholarship of Teaching and Learning by increasing education research rigor and replication frequency. As a result, the computer science community will build theories and practices about how best to educate students to meet workforce needs for computer science talent. In the context of the NSF taxonomy of Education Research this project resides in the Foundational Research category, i.e., one that develops innovations in technology that will influence and inform research and development in different contexts.

The goal of this project is to transform empirical CSEd research by creating and supporting a community of CSEd researchers through: (1) creation and curation of laboratory packages to facilitate empirical CSEd research; (2) facilitation of cohorts of 10-12 educators who are mentored in developing and executing an empirical CSEd research study; and (3) development and presentation of tutorials on empirical research methods at computer science and, in particular, CSEd conferences. Laboratory packages are aids that provide researchers with a research question, a methodology for designing and executing a study, tools and resources to replicate the study, and results of previous related studies. The faculty cohorts using these laboratory packages will have a more focused interaction during a summer session to develop their particular education research study, with a follow-up workshop to report and discuss results. Finally, the tutorials and workshop results will allow for broader dissemination of the key concepts of empirical CSEd research to the larger computer science community.