This page is based on the Osiris course text for Processing Complex Data.
Processing Complex Data
2.5
Contrary to what most introductory data science and statistics courses teach and use, data in science comes in an incredible variety of formats and sizes, and calls for an equally wide variety of procedures. From simple tables to complex multidimensional space-time arrays, including metadata and custom storage formats, the world of data for science is vast, varied, and wildly interesting.
This course is designed to give students an introduction to core real-world data concepts, as well as hands-on experience with handling, processing, and modelling different types of complex data used in various fields of science and beyond. The course leans on student engagement and guided practical group work to create a dynamic learning environment.
At the end of the course, students will be able to:
Assessment is based on a group project, which runs for the duration of the course. The grade for the project is the final grade for the course.
All course materials will be made openly available under a CC-BY license. The readings will be based on books, articles, and other sources which are openly available.
DIGITA_DATA_INFORMAT
Lecture/tutorial
The course is structured around four weekly hackathons.
To pass the course, you need to:
Plagiarism and fraud are serious academic offenses. Plagiarism is the use of another person’s work without proper acknowledgment. This includes copying and pasting text from generative AI, the internet, books, or other students. If you use text from another source, you must put it in quotation marks and provide a citation. If you do not, you are committing plagiarism.
Fraud is the use of dishonest methods to gain an unfair advantage. This includes copying another student’s work, submitting work that is not your own, or submitting the same work for two different courses. If you commit fraud or plagiarism, you will fail the course. If you are not sure what constitutes plagiarism or fraud, see the UU fraud and plagiarism policy.
This course follows Scenario B of the UU GenAI index. You may use generative AI to prepare the work you hand in, but you may not use generative AI to produce the assignment that you hand in, except for copy-editing. You may also use AI tools to help generate code that produces reproducible datasets.
The use of generative AI, such as ChatGPT, in the group assignment is allowed only for:
The use of generative AI must be clearly indicated in the assignment, including a link to the full conversation with the tool, either using the share function in the tool or by exporting the conversation to an online document.
The materials in this course are generated by FSBS teaching staff, who hold the copyright. The intellectual property belongs to Utrecht University.
Warning: These materials contain no information that exceeds the legal use of copyrighted material in academic settings, or that should not be part of the public domain.
You may use all content in this course, excluding staff names and datasets, as input to generative AI tools, provided that the content is not used for further training of the model.
If you do not know how to prevent the content from being used for further training of the model, or if you are not absolutely certain that it will not be used that way, do not use any course materials as input for the AI tool.