Fundamentals of Bioinformatics 2
Module PRINCIPLES OF BIOINFORMATICS

Academic Year 2025/2026 - Teacher: ELISABETTA SCIACCA

Expected Learning Outcomes

Knowledge and understanding

By the end of the course, students know the fundamental principles of programming and the distinction between compiled and interpreted languages. They know the main logical constructs (sequence, selection and iteration) and their representation through flowcharts. They know the basic structure and syntax of the R and Python languages, the operators (arithmetic, relational and logical) and the main data types (both primitive and structured). They know what a development environment is and what it is for, and are familiar with RStudio for R and Spyder for Python.

Applying knowledge and understanding

By the end of the course, students are able to translate simple algorithms into working code using operators and control constructs (if, for, while). They can create and manage data structures such as vectors, lists, matrices and data frames, read and write files and represent data graphically. They are able to install and use external packages and to apply programming principles to solve basic problems, working independently in writing, executing and testing elementary scripts.

The following transversal skills contribute to achieving the learning outcomes through the activities indicated.

Making judgements

Students gain awareness of the typical application areas of the two languages presented and develop the ability to assess the correctness of their own scripts. This ability is exercised through computer-based practical sessions and guided problem solving.

Communication skills

Students acquire the ability to document their code (comments, docstrings) and to describe the logic of an algorithm by means of flowcharts and structured linear notation.

Learning skills

Students develop the ability to continue learning new constructs and libraries independently, consulting the documentation of the languages and installing packages from repositories such as CRAN, Bioconductor, PyPI and GitHub.

Course Structure

The module comprises 26 hours of activities, organised as follows:

       14 hours of teacher-led delivery (Didattica Erogativa, DE), devoted to presenting the theoretical content through lectures supported by slides;

       12 hours of interactive teaching (Didattica Interattiva, DI), devoted to computer-based practical sessions in which students write and run scripts in R or Python, applying the concepts presented in class.

Teacher-led delivery is mainly aimed at acquiring knowledge (constructs, syntax, data types), whereas interactive teaching develops applied skills through the writing and testing of code, contributing to the achievement of the learning outcomes related to applying knowledge and understanding and to making judgements.

If the course is delivered in blended or remote mode, appropriate adjustments may be made to the above, in order to ensure consistency with the syllabus.

Required Prerequisites

None required.

Attendance of Lessons

Attendance is mandatory. Active participation in lectures and, in particular, in the computer-based practical sessions is essential to acquire the practical programming skills in R and Python, which can hardly be developed through individual study alone.

Detailed Course Content

Part I - Introduction to programming (course slides).

Definition of algorithm and programming language.

Description of translators and distinction between compiled and interpreted languages.

Basic programming concepts: variables, assignment, data types and types of operators (arithmetic, relational and boolean).

Introduction to the fundamental programming constructs (sequence, selection and iteration).

Flowcharts and Structured Linear Notation.

Exercises on algorithms and flowcharts.

Assessment quiz.
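
As an illustration of the three fundamental constructs covered in Part I, the following minimal Python sketch counts the G and C bases in a DNA string. The function name gc_fraction and the example sequence are hypothetical and are not taken from the course material.

```python
def gc_fraction(dna):
    """Return the fraction of G and C bases in a DNA string."""
    gc_count = 0                     # sequence: statements run one after another
    for base in dna.upper():         # iteration: visit every base once
        if base in ("G", "C"):       # selection: branch on a condition
            gc_count = gc_count + 1
    if len(dna) == 0:                # avoid dividing by zero on empty input
        return 0.0
    return gc_count / len(dna)


example = "ATGCGC"                   # hypothetical sequence chosen for illustration
print(gc_fraction(example))
```

The same logic can be drawn as a flowchart with one loop and one decision block before being translated into R or Python.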

 

Part II - Introduction to the R language (course slides and scripts; reference [1] for further study).

Installation and introduction to R and the RStudio development environment.

Language basics: comments, assigning values to variables, special values, primitive and advanced data types, checking and converting data types.

Syntax of arithmetic, relational and boolean operators.

Syntax of the fundamental constructs (if-else, while, for) and implementation of simple flowcharts.

Definition of functions and related exercises.

Introduction to the vector data type (the concatenation function, the seq() and rep() functions and other utility functions for vectors).

Introduction to the matrix data type and related functions (element extraction, filtering, handling of rows and columns).

Exercises on vectors and matrices.

Introduction to the list data type (the list() function and utility functions, element extraction).

Introduction to the data frame data type (utility functions, element extraction, adding/removing rows and columns, handling NA values).

Exercises on lists and data frames.

Reading and writing files.

Plots in R (line plots, bar plots, pie charts and aesthetic parameters).

Installation of R libraries from CRAN and Bioconductor.

Assessment quiz.

 

Part III - Introduction to Python (course slides and scripts; reference [2] for further study).

Installation and introduction to Python and the Spyder development environment.

Language basics: comments, assigning values to variables, special values, primitive and advanced data types, checking and converting data types, strings and related operations.

Syntax of arithmetic, relational and boolean operators.

Syntax of the fundamental constructs (if-else, while, for) and implementation of simple flowcharts.

Definition of functions and docstrings and related exercises.

The Python standard library.

Introduction to the list data type (creating a list, element extraction and related methods).

Tuples and sets (definition and related methods).

Dictionaries (definition and related methods).

Installation of external modules via pip and installation of pandas and numpy.

Introduction to numpy matrices and related operations. Exercises on matrices.

Filtering matrices using boolean masks and the np.where function.

Introduction to pandas series and related operations.

Introduction to pandas data frames and related operations.

Exercises on series and data frames.

Reading and writing files (CSV format).

Plots in Python with matplotlib (line plots, bar plots and aesthetic parameters).

Assessment quiz.

Textbook Information

The teacher will provide teaching materials on the course's Studium page in the form of:

       slides discussed in class;

       R and Python scripts containing the code of the exercises carried out in class.

This material is the main support for studying the subject; the following texts are recommended for further study:

[1]   R. I. Kabacoff, R in Action: data analysis and graphics with R, Manning.

[2]   C. Walsh, Python for beginners: an essential guide to learn with basic exercises.


Learning Assessment

Learning Assessment Procedures

A single learning assessment covers the whole integrated course of Principles of Bioinformatics 2; it consists of a written test, possibly followed by an oral interview.

The written test lasts 60 minutes and comprises:

·       a multiple-choice test of 30 questions, 15 of which relate to the Principles of Computer Science module and 15 to the Principles of Bioinformatics module; each question has only one correct answer and four incorrect ones (correct answer: +1 point; incorrect answer: −0.25 points; no answer: 0 points);

·       two exercises, one for each module, each marked from 0 to 2 points.

Each written test is valid only for the examination session (appello) in which it is taken: the result is not carried over to subsequent sessions and, should the learning assessment not be completed within the same session, the test must be retaken.

The score of the multiple-choice test and that of the exercises together determine the written mark. The written test is passed with a minimum score of 16. The oral interview is compulsory for students obtaining a written score of 16 or 17, while it is optional for those obtaining a score of 18 or higher. The oral interview may only confirm or improve the mark achieved in the written test and can never lower it. Distinction (lode) is awarded to students whose score exceeds 30 once the contribution of the exercises or of the oral interview is added.
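
The scoring rules above can be sketched in a few lines of Python. This is only an illustrative sketch: it assumes that the written mark is the plain sum of the multiple-choice score and the two exercise marks, with no rounding (the syllabus does not state a rounding rule), and that any mark below 18 is treated like a 16 or 17. The function names are hypothetical.

```python
def written_mark(correct, incorrect, exercise_points):
    """Written mark: +1 per correct answer, -0.25 per incorrect answer,
    0 per unanswered question, plus the exercise marks (0 to 2 each)."""
    quiz_score = correct * 1.0 - incorrect * 0.25
    return quiz_score + sum(exercise_points)


def outcome(mark):
    """Map a written mark to the outcome described in the syllabus."""
    if mark < 16:
        return "not passed"
    if mark < 18:                    # assumption: covers marks of 16 or 17
        return "passed, oral interview compulsory"
    return "passed, oral interview optional"


# Hypothetical student: 22 correct, 4 incorrect, 4 unanswered, exercises 1 and 2
mark = written_mark(22, 4, [1, 2])
print(mark, outcome(mark))           # 22 - 1 + 3 = 24.0
```

Under these assumptions the highest possible written mark is 30 + 2 + 2 = 34, which is how a score above 30 can be reached.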

As regards the oral interview, the assessment will consider: the relevance of the answers to the questions asked, the appropriate use of technical language, the ability to provide examples and the student's overall communication skills.

Learning assessment may also be carried out on-line, should the conditions require it.

To ensure equal opportunities and in compliance with current laws, interested students may request a personal interview in order to plan any compensatory and/or dispensatory measures based on educational objectives and specific needs. Students can also contact the CInAP (Centro per l'integrazione Attiva e Partecipata - Servizi per le Disabilità e/o i DSA) contact teacher for their department (https://www.cinap.unict.it/content/referenti).

Examples of frequently asked questions and/or exercises

Which of the following data types can contain elements of different types?

  1. Vector
  2. Matrix
  3. List
  4. None of the above
  5. Vector, Matrix and List


What is the correct way to assign a matrix to the variable my_matrix in R?

  1. my_matrix = array([ [1,2], [5,3], [7,8] ])
  2. my_matrix = [ [1,2], [5,3], [7,8] ]
  3. my_matrix = matrix(c(1,2, 5,3, 7,8), nrow=3, ncol=2)
  4. my_matrix = m(c(1,2, 5,3, 7,8), nrow=3, ncol=2)
  5. my_matrix = matrix(1,2, 5,3, 7,8, nrow=3, ncol=2)


What is the assignment symbol in R?

  1. =
  2. <-
  3. <=
  4. ==
  5. ->


In RStudio, which window is used to display the contents of variables and objects loaded into the workspace?

  1. Console
  2. Plots
  3. Script Editor
  4. Environment
  5. Files

EXERCISE

Use a language of your choice (R or Python) to write a function that, given two input numbers, returns the maximum of the two.
Provide an example of calling the function with input numbers of your choice.
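
One possible answer in Python is sketched below. It is not the official solution: the function name maximum and the input values are arbitrary choices, and the comparison is written out explicitly with if rather than calling the built-in max(), so that the selection construct is visible.

```python
def maximum(a, b):
    """Return the larger of two numbers (either one if they are equal)."""
    if a >= b:
        return a
    return b


# Example call with two arbitrarily chosen numbers
print(maximum(3, 7))  # prints 7
```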