Multimodal Artificial Intelligence in Urban Infrastructure Analysis

Title: Multimodal Artificial Intelligence in Urban Infrastructure Analysis

Speaker: Prof. Roberto M. Cesar Jr.

Time: September 6, 2024, 10:30-11:30 AM

Venue: Lecture Hall 106, Wangxuan Institute (王选所)

Abstract:

Growing urbanization and the need to improve modern city infrastructure drive the use of innovative technologies for urban data collection and analysis. This lecture presents a project that employs multimodal artificial intelligence to capture and analyze data on urban infrastructure, focusing on sidewalks and bus stops. Data are collected with smartphones whose sensors record video, audio, accelerometer readings, and GPS positions; the devices are mounted on users' vests and on strategically positioned tripods. The project involves the development of an application for data capture, Jupyter notebooks for analysis and visualization, and a Python library that facilitates data capture and analysis. Research in computer vision is crucial to this project and involves the development of new, efficient methods for optical flow, saliency detection, and video cropping. By applying advanced computer vision techniques and multimodal machine learning, the project aims to extract critical information for improving pedestrian mobility and the accessibility of public transport for its users. This work provides insights into how technology can be applied to address contemporary urban challenges and enhance the quality of life in cities.

Speaker bio: Roberto has been a professor at the University of Sao Paulo (USP) since 1998 (BSc in Computer Science - UNESP - 1991; MSc in Electrical Engineering - UNICAMP - 1993; Ph.D. in Physics - USP/Brazil, IPT-UCL/Belgium - 1997). He is currently a Full Professor in the Department of Computer Science - IME - USP, working in the Data Science Research Group, and serves as special advisor for Physical Sciences and Engineering at the Sao Paulo Research Foundation (FAPESP). He has served as Director of the eScience Research Center at USP and as head of the Computer Science Department. He was a member of the editorial boards of Image and Vision Computing and of Signal, Image and Video Processing, and has been chair and invited speaker at conferences and workshops (Sibgrapi 2003, CIARP 2010, Sibgrapi 2011, SHAPES 2.0 - 2012, eSon - IEEE eScience 2013, IEEE eScience 2014). He has experience in computer science, with emphasis on computer vision, machine learning, and artificial intelligence.