Course Content (Syllabus)
Characteristics of computing systems for the very-edge, edge, and cloud. Characteristics of computationally intensive and memory-intensive algorithms (data analytics, machine learning, deep learning algorithms). Frameworks for analyzing computing requirements (simulators, profiling techniques, cloud-based tools and methods). Design objectives for the edge.
Dependability and fault tolerance: Definition of the terms reliability, availability, maintainability, security, and performance. Errors, faults, and failures. Soft and hard errors. Fault models. Redundancy in space and time. Redundancy in software and hardware. Graceful degradation. Applications.
Power-aware computing: Dynamic and static power. Moore's law and Dennard scaling. Low-power processors and systems. Dynamic voltage and frequency scaling (DVFS). DVFS for IoT and edge-level processors. Approximate computing. Project on analyzing power consumption in smartphones: power models based on performance counters.
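As an illustration of why DVFS appears under power-aware computing, the sketch below evaluates the standard first-order model of dynamic (switching) power, P = alpha * C * V^2 * f. The function name and all numeric values (activity factor, capacitance, operating points) are assumptions chosen for illustration, not course material or measurements from a real device.

```python
def dynamic_power(alpha, capacitance, voltage, frequency):
    """First-order switching power in watts: P = alpha * C * V^2 * f."""
    return alpha * capacitance * voltage ** 2 * frequency


# Hypothetical operating points: nominal, and both V and f scaled by 0.8.
nominal = dynamic_power(alpha=0.2, capacitance=1e-9, voltage=1.0, frequency=2.0e9)
scaled = dynamic_power(alpha=0.2, capacitance=1e-9, voltage=0.8, frequency=1.6e9)

# Power falls with V^2 * f, so scaling both by 0.8 gives 0.8^3 of nominal.
power_ratio = scaled / nominal

# For a fixed number of cycles, run time grows as 1/f, so energy per task
# (power * time) depends only on V^2 in this model.
cycles = 1e9
energy_nominal = nominal * (cycles / 2.0e9)
energy_scaled = scaled * (cycles / 1.6e9)
energy_ratio = energy_scaled / energy_nominal

print(f"power ratio:  {power_ratio:.3f}")
print(f"energy ratio: {energy_ratio:.3f}")
```

The model ignores static (leakage) power, which does not shrink with frequency and can dominate when a task is stretched over a longer run time.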
Real-time responsiveness and worst-case execution time: Operating systems and processes in devices with limited resources. Performance estimation and memory management in IoT and edge-level devices. Project on using the Yocto tool to build Linux kernels.
Edge processing solutions (the portfolio of ARM): Embedded processors. Embedded hardware systems. Basic principles of increasing instruction-level parallelism (superscalar, VLIW, SIMD, MIMD, dynamic execution, memory hierarchies, hyperthreading, SMT processors, multicores).
Code optimization techniques for single and multicore architectures. Overview of static and dynamic optimization techniques. Code-level optimizations: Instruction dependencies, loop-level transformations, static branch prediction, data-level transformations. Improving locality of reference and reducing the number of misses (caches, TLBs). Programming techniques for multiprocessors and multithreaded architectures.
Parallel architectures for machine learning kernels. Parallel architectures, parallel systems, accelerators, application-specific processors for machine learning kernels. Parallelization of machine learning applications on general-purpose processors and accelerators.
Keywords
Edge computing, Computational and memory requirements, Profiling, Reliability, Redundancy, Low power computing, Dynamic voltage and frequency scaling, Dynamic and static power, Real-time response, Worst-case execution time, Embedded processors, Code level transformations, Parallel computing for machine learning applications
Additional bibliography for study
1) Parallel Computer Organization and Design, Michel Dubois, Murali Annavaram, Per Stenstrom, 2012.
2) Reconfigurable Computing, Scott Hauck, André DeHon, Morgan Kaufmann.
3) Parallel Computing for Data Science: With Examples in R, C++ and CUDA, Norman Matloff, Chapman and Hall/CRC The R Series, 2016.
4) Scaling Up Machine Learning: Parallel and Distributed Approaches, Ron Bekkerman, Mikhail Bilenko, John Langford, Cambridge University Press, 2018.