Mining from Massive Datasets

Course Information
Title: Εξόρυξη Γνώσης από Μεγάλα Σύνολα Δεδομένων / Mining from Massive Datasets
Code: DWS203
Faculty: Sciences
School: Informatics
Cycle / Level: 2nd / Postgraduate
Teaching Period: Spring
Coordinator: Apostolos Papadopoulos
Common: Yes
Status: Active
Course ID: 600016262

Programme of Study: Postgraduate Programme (PMS) in Data and Web Science (2018 to present), Part-time (MF)

Registered students: 1
Orientation: KORMOS (Core)
Attendance Type: Elective Courses belonging to the selected specialization
Semester: 2
Year: 1
ECTS: 7.5

Programme of Study: Postgraduate Programme (PMS) in Data and Web Science (2018 to present), Full-time (PF)

Registered students: 16
Orientation: KORMOS (Core)
Attendance Type: Elective Courses belonging to the selected specialization
Semester: 2
Year: 1
ECTS: 7.5

Class Information
Academic Year: 2018 – 2019
Class Period: Spring
Faculty Instructors
Weekly Hours: 3
Total Hours: 39
Class ID: 600132037
Type of the Course
  • Scientific Area
  • Skills Development
Course Category
Specific Foundation / Core
Mode of Delivery
  • Face to face
Language of Instruction
  • Greek (Instruction, Examination)
  • English (Instruction, Examination)
Learning Outcomes
1. Students will acquire substantial knowledge of big data management and analytics.
2. They will work in teams.
3. They will gain confidence by presenting their work in class.
4. They will become familiar with modern big data analytics techniques that have many applications in industry.
General Competences
  • Apply knowledge in practice
  • Retrieve, analyse and synthesise data and information, with the use of necessary technologies
  • Work autonomously
  • Work in teams
  • Generate new research ideas
Course Content (Syllabus)
  • Introduction to Big Data Management and Analytics
  • Hadoop: basic and advanced topics
  • The Hadoop ecosystem: HDFS, HBase, Pig, Hive
  • NoSQL databases
  • Theoretical issues in MapReduce
  • The Scala programming language
  • The Spark platform: basic and advanced issues
  • Streaming, SQL, Machine Learning, GraphX: the basic libraries
  • Data exploration using SparkR
  • Algorithm design in Spark
  • Graph databases
  • Other systems: Giraph, GraphLab, Hama, BlinkDB
Keywords
big data, data management, data mining from big data, big data analytics
Educational Material Types
  • Slide presentations
  • Book
Use of Information and Communication Technologies
Use of ICT
  • Use of ICT in Course Teaching
  • Use of ICT in Laboratory Teaching
  • Use of ICT in Communication with Students
  • Use of ICT in Student Assessment
Course Organization
Activities and Workload (hours)
Lectures: 39
Reading Assignment: 100
Project: 55
Written Assignments: 32
Total: 226
Student Assessment
Student Assessment methods
  • Written Exam with Extended Answer Questions (Summative)
  • Written Assignment (Formative, Summative)
  • Performance / Staging (Formative, Summative)
  • Written Exam with Problem Solving (Summative)
  • Report (Formative, Summative)
Bibliography
Additional bibliography for study
H. Karau, A. Konwinski, P. Wendell, M. Zaharia: Learning Spark, O'Reilly, 2015.
N. Lynch: Distributed Algorithms, Morgan Kaufmann, 1996.
I. Robinson, J. Webber, E. Eifrem: Graph Databases, O'Reilly, 2013.
S. Ryza, U. Laserson, S. Owen, J. Wills: Advanced Analytics with Spark, O'Reilly, 2015.
R. Schutt, C. O'Neil: Doing Data Science, O'Reilly, 2014.
C.A. Varela, G. Agha: Programming Distributed Computing Systems: A Foundational Approach, The MIT Press, 2013.
Last Update
31-01-2020