Presenting Organizations

 

Tutorial #1: Large Language Models and RDF-based Knowledge Graphs: Bridging the Gap (LLM4KG)

Overview: 

In this engaging tutorial, we delve into the fusion of Large Language Models (LLMs) and RDF-based Knowledge Graphs (KGs). You’ll be introduced to LangChain, an open-source library designed to streamline interactions with LLMs.

  1. Understanding LLMs and KGs: We start with a basic overview of LLMs and KGs. We’ll discuss their potential roles in the field of Artificial
    Intelligence (AI) and Machine Learning (ML).
  2. The Magic of LLMs and KGs Together: We’ll present LLMs and KGs as the two halves of a powerful brain. The LLM is the ‘quick-thinker’,
    providing instant responses, while the KG is the ‘deep thinker’, working with complex data and relations. This concept is akin to the ‘system 1’ and ‘system 2’ idea from the book “Thinking, Fast and Slow”.
  3. Making Life Easier with LLMs: We’ll explore how LLMs can simplify working with SPARQL, translating between natural language and formal queries. You’ll see an example of this when we feed an LLM RDF data and it creates a SPARQL query.
  4. Working with Langchain: You’ll see our work in action with Langchain. We’ll demonstrate this powerful open-source library’s potential in handling and interacting with LLMs.
  5. Practical Experience - Creating SPARQL Queries with LLMs: Here comes the fun part. We’ll use LLMs to craft SPARQL queries, and you’ll witness the magic of transforming ordinary questions into proper SPARQL queries (a minimal sketch of this pattern appears right after this list).
  6. Imagining the Future: We’ll then look ahead, brainstorming improvements for open-source libraries. We encourage audience participation - your ideas could shape the future of these tools.
  7. Venturing into Non-RDF to RDF Conversion: Lastly, we’ll touch on the potential of converting non-RDF data into RDF using LLMs. This
    segment aims to spark conversation and ideas on how to build interfaces to make this transition easier.
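
As a taste of the hands-on part, here is a minimal sketch (not the tutorial's own code) of the question-to-SPARQL pattern: an LLM drafts a query from a natural-language question plus sample triples, and rdflib executes it locally. The `ask_llm` stub stands in for a real LLM call (e.g., through LangChain), and the two-person graph is a hypothetical example.

```python
# Sketch of the natural-language-to-SPARQL pattern; assumes rdflib.
from rdflib import Graph, Namespace

EX = Namespace("http://example.org/")
FOAF = Namespace("http://xmlns.com/foaf/0.1/")

graph = Graph()
graph.add((EX.bob, FOAF.knows, EX.alice))  # hypothetical sample data

def ask_llm(prompt: str) -> str:
    # Placeholder for a real LLM call (e.g., via LangChain). A canned
    # answer keeps this sketch self-contained and runnable.
    return (
        "SELECT ?person WHERE { "
        "?person <http://xmlns.com/foaf/0.1/knows> "
        "<http://example.org/alice> . }"
    )

question = "Who knows Alice?"
triples = "\n".join(f"{s.n3()} {p.n3()} {o.n3()} ." for s, p, o in graph)
prompt = (
    f"Given the RDF triples:\n{triples}\n\n"
    f"Write a SPARQL SELECT query answering: {question}\n"
    "Return only the query."
)

sparql = ask_llm(prompt)          # the LLM drafts the query...
for row in graph.query(sparql):   # ...and rdflib evaluates it
    print(row.person)             # http://example.org/bob
```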

Join us on this fascinating journey to simplify the complex world of RDF-based Knowledge Graphs using Large Language Models. You’ll gain hands-on experience and contribute to the future of these open-source libraries.

Outcomes:

  1. Fundamental Understanding: Participants will acquire a basic understanding of how Large Language Models (LLMs) and RDF-based Knowledge Graphs (KGs) complement each other.
  2. In-Depth Knowledge: The tutorial will provide in-depth knowledge about the interaction and synergy between LLMs and KGs, thus offering a new perspective on data processing and analysis.
  3. Practical Skills: Attendees will gain hands-on experience in making SPARQL queries simpler using LLMs, enhancing their practical skills and proficiency in handling complex data queries.
  4. Experience with Langchain: Participants will get exposure to Langchain, an open-source library, learning about its capabilities and
    how to interact with it.
  5. Future Enhancements: Attendees will contribute their ideas for potential improvements and new features for open-source libraries, giving them an opportunity to influence the future direction of these tools.
  6. Data Transformation Insights: The tutorial will also offer insights into the potential of converting non-RDF data into RDF using LLMs,
    expanding the scope of participants’ knowledge and sparking innovative ideas.

By the end of the tutorial, participants should walk away with a broader understanding and practical know-how in leveraging the power of LLMs and KGs, along with a sneak-peek into the future of data query simplification and transformation.

Prerequisites:

A general understanding of Knowledge Graphs and their potential use cases is recommended for participants. Although familiarity with RDF and SPARQL provides a beneficial background for the discussions and demonstrations, it is not a mandatory requirement.

Presenter: Adrian Gschwend


Adrian Gschwend, a leading authority in knowledge graphs, enterprise knowledge & governance, and data modeling, is co-founder and CEO of Zazuko GmbH, the Linked Data company in Switzerland known for its open-source-first approach. With a decade of experience in Linked Data, he has become a specialist in ontologies, schemas, and vocabularies, driving the design and implementation of complex data models.

Under his leadership, Zazuko developed the Swiss Government’s Linked Data Service (LINDAS) Platform. His expertise extends to developing an Enterprise Knowledge and Governance platform based on RDF & Linked Data. Adrian contributes to the broader technology community as the co-chair of the W3C RDF Star working group, alongside Amazon AWS’s Ora Lassila.

Bringing together his extensive technical knowledge, leadership skills, and passion for innovation, Adrian Gschwend continues to push the boundaries of Linked Data and knowledge management to bring advanced solutions to enterprises worldwide.

Download Flyer

 

Tutorial #2: Interpretable AI/ML Models for High-stakes Tasks with Human-in-the-loop (IMLH)

Overview: 

In high-stakes tasks, incorrect decisions or predictions can result in significant harm to individuals or society, including life-threatening consequences. Examples of domains with high-stakes tasks are medical diagnosis (cancer, heart diseases), financial decision-making (credit risk, fraud detection), autonomous vehicles (traffic prediction and pedestrian detection), criminal justice (predictive policing and sentencing recommendations), disaster response planning (identifying areas at risk of disasters – hurricanes, floods, and earthquakes), search and rescue operations (predictions about potential search areas), and others.
An explanation must be exact (accurately describing the model’s inner workings and decision-making process) and convincing to the user. If either of these properties is lacking, the result is only a quasi-explanation that does not serve the explanation goal. Many currently popular AI/ML explanation methods are quasi-explanations. This tutorial will present current methods with their benefits and deficiencies, as well as ways to overcome those deficiencies, with a focus on high-stakes AI/ML tasks.
Explainable AI/ML is about much more than allowing humans to understand why certain decisions are made; it also allows this understanding to be matched against users’ domain knowledge. It requires a human in the loop, which includes checking the consistency of the ML model with the domain knowledge, and can lead to reliable and trustworthy models. This tutorial will (1) present major topics of research in interpretable AI/ML models for high-stakes tasks with a human in the loop, and (2) motivate and explain topics of emerging importance in this area to the PRICAI community. The tutorial materials will be available to PRICAI participants online. The topics listed below will be covered in the tutorial, with a significant portion devoted to visual knowledge discovery methods with software illustration.

Outcomes:

The audience will be exposed to the topics of emerging importance in interpretable AI/ML for high-stakes tasks:

  1. Methodology of explaining black-box models. It is often difficult to understand how these models make decisions, which is critical for high-stakes tasks.
  2. Computational methods for explaining AI/ML models. Multiple existing methods can help identify the most important features of a model, but they can unfortunately also mislead, including popular methods like LIME and SHAP (see the sketch after this list).
  3. Human-in-the-loop: a mandatory part of explainable AI/ML that remains heavily underdeveloped. It is needed to ensure that the model’s decisions are explained to and understood by humans, building trust in the model.
  4. Visual knowledge discovery (VKD) is emerging as a major approach for implementing human-in-the-loop, where the human expert can provide valuable insights and feedback in the visualization space to build better models and prevent catastrophic errors.
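
Topic 2 references LIME and SHAP; below is a minimal sketch of how such a post-hoc attribution tool is typically invoked (assuming the `shap` and `scikit-learn` packages; the dataset and model are illustrative stand-ins). The tutorial's point is precisely that such numbers still require human-in-the-loop validation.

```python
# Minimal sketch of a post-hoc attribution workflow of the kind the
# tutorial critiques; assumes the shap and scikit-learn packages.
import shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

X, y = load_diabetes(return_X_y=True, as_frame=True)
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# TreeExplainer computes Shapley-value attributions for tree ensembles.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X.iloc[:100])  # (100, n_features)

# Per-feature attributions for one prediction; a human expert must
# still judge whether these align with domain knowledge.
print(dict(zip(X.columns, shap_values[0].round(3))))
```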

Prerequisites:

The target audience for the tutorial is AI/ML researchers, students, and practitioners with basic knowledge of Machine Learning methodology and methods, including interpretable Machine Learning and Dimension Reduction.

Presenter: Prof. Boris Kovalerchuk


Dr. Boris Kovalerchuk is a professor of Computer Science at Central Washington University, USA. His publications include four books published by Springer: "Data Mining in Finance" (2000), "Visual and Spatial Analysis" (2005), "Visual Knowledge Discovery and Machine Learning" (2018), and “Integrating Artificial Intelligence and Visualization for Visual Knowledge Discovery” (2022), chapters in the Data Mining/Machine Learning Handbooks, and over 200 other publications. His research and teaching interests are in AI, machine learning, visual analytics, visualization, uncertainty modeling, image and signal processing, and data fusion. Dr. Kovalerchuk has been a principal investigator of research projects in these areas, supported by US Government agencies. He has served as a member of expert panels at international conferences and on panels organized by US Government bodies. Prof. Kovalerchuk regularly teaches classes on AI, Machine Learning, Data Mining, Information and Data Visualization, and Visual Knowledge Discovery at Central Washington University. He has also taught these topics at several other universities in the US and abroad. Dr. Kovalerchuk delivered relevant tutorials at IJCNN 2017, HCII 2018, KDD 2019, ODSC West 2019, WSDM 2020, IJCAI/PRICAI 2021, and IV 2023.

Download Flyer

 

Tutorial #3: Ontology Based Data Access and Data Independence (OBDADI)

Overview: 

Among the most commonly cited features of the ontology based data access (OBDA) and ontology mediated querying (OMQ) approaches to accessing data sources is their ability to offer a high-level, user-friendly interface to a conceptual understanding of the data (aka ontologies), while still utilizing low-level but efficient ways of representing the data in a computer store. The aim of this tutorial is to compare and contrast this OBDA-based approach with approaches centred around the concept of data independence, which has been under development in the area of database systems since the early 1970s. The tutorial focuses on the common lessons shared by all approaches, and on how each can benefit from lessons learned from the other.

Accessing information utilizing high-level data models or ontologies has been a long-standing objective of research communities in several areas. In work based on knowledge representation in artificial intelligence (AI), this objective commonly falls under the heading of OBDA and OMQ, and has fostered the development of approaches using query rewriting or variants of the so-called combined approach. However, the underlying idea of separating an ontological view of how information is to be understood by users from a physical view of the layout of data in data structures, called data independence, has been the focus of work in the area of information systems for more than fifty years.

This tutorial explores how the original idea of data independence evolved and ultimately culminated in logic-based approaches to information management that enable high-level ontological views of information entirely devoid of any low-level physical views of concrete data layout.  An integral part of the tutorial is to explore the relationship between such high-level ontologies that users see and the understanding of the physical representation of such information in computer systems that is necessary to attain acceptable performance.  The tutorial will address the latter by showing how ontologies derived by ontology design in AI can be used to achieve an understanding of the physical encoding of information that is sufficiently fine-grained to ensure that the code ultimately executed to satisfy users' information requests can be competitive with solutions hand-written in low-level programming languages such as C.
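
To make the rewriting idea concrete, the toy sketch below (our illustration, not the tutorial's material) expands a query over one ontology class into the union of stored classes implied by hypothetical subclass axioms; this is the core move behind many OBDA query-rewriting techniques.

```python
# Toy illustration of OBDA query rewriting: a query over an ontology
# class is expanded, via subclass axioms, into a union of queries over
# the classes actually stored. The axioms are hypothetical examples.
from collections import defaultdict

subclass_of = [("Professor", "Employee"), ("Lecturer", "Employee")]

def rewrite(query_class: str) -> set[str]:
    """Return all classes whose instances answer a query for `query_class`."""
    children = defaultdict(set)
    for sub, sup in subclass_of:
        children[sup].add(sub)
    result, frontier = {query_class}, [query_class]
    while frontier:
        for sub in children[frontier.pop()]:
            if sub not in result:
                result.add(sub)
                frontier.append(sub)
    return result

# Query "Employee(x)" becomes a union over three stored classes, which
# can then be evaluated directly against the low-level data layout.
print(rewrite("Employee"))  # {'Employee', 'Professor', 'Lecturer'}
```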

Outcomes: 

The topics covered in the tutorial are of interest to a wide range of AI researchers and to members of the general public with an interest in the representation and manipulation of information and knowledge.  In particular, the tutorial targets the following groups:

(1) Undergraduate and graduate students and junior researchers: the tutorial introduces this group to state-of-the-art approaches to addressing issues connected with representation, storage, and manipulation of information and to modern techniques that address these issues;
(2) Researchers in the area of knowledge representation and other areas of AI: the tutorial provides bridges to many areas of AI where large data sets are used, ranging from approaches to knowledge representation and, in particular, the implementation of such systems, to managing information for Semantic Web systems;
(3) Industry practitioners and developers: the tutorial provides ideas on how development of software systems, in particular in the critical phase of conceptual modelling and its mapping to physical computer storage, can be improved and what tools are available to aid this goal;
(4) Members of the general public, with an interest in logical underpinnings of logic-based information management and in technologies based on these ideas.

Prerequisites:

The tutorial assumes the audience is familiar with the basics of first order logic and of conceptual modelling formalisms (such as ER or UML) at the introductory university course level. No knowledge of particular ontology/KR languages such as Description Logics and other formalisms is assumed by the tutorial.

Presenter #1: Prof Dr. David Toman


Dr. David Toman is a professor of Computer Science at the University of Waterloo, Canada. He has published and presented results in the area of knowledge representation over the last 25 years at premier AI conferences.  He received two Ray Reiter Prizes, at KR 2010 and at KR 2016, the latter for work related to the use of referring expressions in knowledge representation (jointly with Grant Weddell and Alex Borgida). This work was later extended to the area of conceptual modelling and was awarded the 2018 Bob Wielinga Best Paper Award for furthering the use of referring expressions in conceptual modelling.  Dr. Toman has also given numerous tutorials in the area of temporal representation and reasoning and temporal databases (which led to an invited chapter in the Handbook of Temporal Reasoning in Artificial Intelligence), on identification issues in knowledge representation systems, and on logic-based approaches to query compilation and optimization, all at premier AI conferences such as IJCAI, ECAI, and KR.

Presenter #2: Dr. Grant Weddell


Dr. Grant Weddell has been a member of the faculty in the Cheriton School of Computer Science at the University of Waterloo in Canada for more than thirty-five years. He is a member of the Data Systems Group and works primarily in the area of structured databases on topics relating to their physical, logical, and conceptual design. His current work is on extensions to SQL needed for structured data integration for data sources that conform to a variety of data models such as low-level file systems, graph databases, NoSQL systems, JSON documents, and so on, with a focus on reference issues, on query evaluation in both open and closed world settings, in particular on view-based query rewriting, and more generally on logic in computer science. With Professor David Toman and others, he has also developed the FunDL family of description logics for capturing and reasoning about database schemata based on object-relational notions and abstractions, in particular about logical consequence relating to common underlying varieties of integrity constraints.

Tutorial's detailed outline and slides: https://cs.uwaterloo.ca/~david/pricai23/

Download Flyer

 

Tutorial #4: Instance Space Analysis for Rigorous and Insightful Algorithm Testing (ISA)

Overview: 

This hands-on tutorial introduces Instance Space Analysis (ISA), a methodology for the experimental evaluation of algorithms, making use of the online tools available at the Melbourne Algorithm Test Instance Library with Data Analytics (MATILDA - https://matilda.unimelb.edu.au). ISA was developed as an alternative to standard algorithm testing practice, that is, reporting performance on average across a set of well-studied benchmark instances. A drawback of standard practice is its critical dependency on the choice of the benchmark set. Ideally, such a set should be unbiased, challenging, and contain a mix of synthetically generated and real-world-like instances with diverse structural properties. Without this diversity, our ability to generalise conclusions about performance is limited. Moreover, reporting performance on average highlights the strengths while masking the weaknesses.

ISA opens the opportunity to explore the algorithm's strengths and weaknesses across the benchmark set, providing objective measures of relative power while giving indications of the set's diversity. To do so, ISA constructs a bi-dimensional space where each instance is represented as a point and defines a boundary where instances are likely to be located. Moreover, ISA identifies an "algorithm footprint," or the region of the space where the "good" performance of an algorithm is expected, based on empirical evidence. Combining performance information with measured characteristics of each instance, we can construct hypotheses on an algorithm's behaviour. Finally, by exploring the space, we identify areas where new instances are needed, allowing an algorithm to be comprehensively "stress-tested". MATILDA also provides a collection of meta-data and results for several well-studied machine learning and optimisation problems.
The tutorial runs for half a day, divided into three sections of 45 minutes each, with a Q&A session in the final 15 minutes. In the first section, we will introduce the challenges in the experimental testing of algorithms, followed by a description of the Instance Space Analysis methodology. In the second section, a live demonstration of the tools available at MATILDA will be given, illustrated with some of the results already available in the library's case studies. Finally, we will carry out an interactive exercise where participants will be able to use the tools with a prepared dataset and download their results.
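
The following toy sketch (our illustration, not MATILDA's MATLAB code) mimics the core ISA construction: instances described by feature vectors are projected to 2D and labelled with the best-performing algorithm. PCA stands in for ISA's tailored projection, and all data are synthetic.

```python
# Toy sketch of the instance-space idea: project feature-described
# instances into 2D and colour by which algorithm performed best.
# PCA is only a stand-in for ISA's tailored optimal projection.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
features = rng.normal(size=(200, 6))               # hypothetical instance features
perf_a = features[:, 0] + rng.normal(0, 0.3, 200)  # synthetic performance data
perf_b = features[:, 1] + rng.normal(0, 0.3, 200)

coords = PCA(n_components=2).fit_transform(features)  # 2D instance space
best = np.where(perf_a > perf_b, "A", "B")            # winner per instance

# Each instance is now a labelled point in the plane; an algorithm's
# "footprint" is a region where it is reliably the better performer.
for (x, y), winner in list(zip(coords, best))[:5]:
    print(f"({x:+.2f}, {y:+.2f}) -> best algorithm: {winner}")
```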

Outcomes:

The tutorial is aimed at students, researchers and practitioners in the areas of Machine Learning and Optimization, who wish to improve their ability to analyse algorithm evaluation studies to draw insightful conclusions about the strengths and weaknesses of algorithms, and the adequacy of benchmark test suites. By the end of the tutorial, participants will be in a position to create their own instance space analysis for studying the strengths and weaknesses of their algorithm comparison studies.

Prerequisites:

There is no explicit prerequisite knowledge, but a general understanding of experimental algorithmics is beneficial. A laptop with a modern browser is required for using MATILDA's online tools, available at https://matilda.unimelb.edu.au/matilda/login. To log in, an account must be created at https://matilda.unimelb.edu.au/matilda/newuser. For more advanced users, the MATLAB code used by MATILDA can be downloaded at https://github.com/andremun/InstanceSpace.

Presenter #1: Kate Smith-Miles


Kate Smith-Miles is a Professor of Applied Mathematics in the School of Mathematics and Statistics at The University of Melbourne, and the Director of the ARC Centre in Optimisation Technologies, Integrated Methodologies and Applications (OPTIMA). Before joining The University of Melbourne in September 2017, she was Professor of Applied Mathematics at Monash University and Head of the School of Mathematical Sciences (2009-2014). Previous roles include President of the Australian Mathematical Society (2016-2018), and membership of the Australian Research Council College of Experts (2017-2019). Kate was elected Fellow of the Institute of Engineers Australia (FIEAust) in 2006, and Fellow of the Australian Mathematical Society (FAustMS) in 2008.

Kate obtained a BSc(Hons) in Mathematics and a PhD in Electrical Engineering from The University of Melbourne. She has published two books on neural networks and data mining; and around 300 refereed journal and international conference papers in neural networks, optimisation, data mining, and various applied mathematics topics. She has supervised over 30 PhD students to completion and has received over AUD$20 million in competitive grants.

Presenter #2: Mario Andres Munoz


Mario Andres Munoz is a Research Fellow at the ARC Centre in Optimisation Technologies, Integrated Methodologies and Applications (OPTIMA), and the School of Computer and Information Systems, the University of Melbourne. Before joining the University of Melbourne, he was a Research Fellow in Applied Mathematics at Monash University (2014-2017). Mario Andres obtained a BEng (2005) and an MEng (2008) in Electronics from Universidad del Valle, Colombia, and a PhD (2014) in Engineering from the University of Melbourne. He has published over 50 refereed journal and conference papers in optimisation, data mining, and other interdisciplinary topics. He has supervised 2 PhD students to completion and currently supervises 7 PhD students. He has received over AUD$1 million in research funding. He developed and maintains the MATLAB code that drives MATILDA.

Download Flyer

 

Tutorial #5: Computational Social Choice Competition (COMPSOC)

Overview: 

Computational social choice (COMSOC) is an emergent and multidisciplinary field that combines computer science and social choice principles for aggregating collective decisions. This thriving field of research has numerous applications in resource allocation, fair division, election systems, and distributed ledgers.

One of the most well-studied problems in COMSOC focuses on designing voting mechanisms for selecting the winners in an election. Voting is essential to democracy as it allows individuals to voice their choices and elect their representatives. Despite its wide range of applications, voting procedures are susceptible to manipulation and behaviors compromising fairness in outcomes. Paradoxes and impossibility results are commonly encountered when implementing voting rules in electoral systems. Recent research in the field aims to tackle these challenges by exploring procedures that select multiple winners or borrowing innovative techniques from machine learning.

Computer simulations, agent-based modeling, generative models, and generative agents have emerged as practical ways to address some challenges encountered in COMSOC research. In line with this vision, this tutorial will build on the progress in agent-based development and machine learning to teach you practical ways to benchmark voting rules in a competitive setting. The tutorial covers the design, implementation, deployment, and analysis of voting rules. The tutorial will provide valuable insights into the performances of voting mechanisms defined over parametrically generated voting problems, alternatives, and voters. The tutorial will balance theory and practice. The theoretical parts will introduce the audience to essential concepts in social choice theory. The practical parts of the tutorial will rely on hands-on activities using a dedicated Python SDK. The tutorial is relevant to audiences interested in group decision-making, social choice, and collective intelligence.
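
As a flavour of the hands-on activities, here is a minimal voting-rule sketch in plain Python: a Borda count over ranked ballots. The ballots are hypothetical, and the tutorial's dedicated SDK wraps rules like this behind a common benchmarking interface.

```python
# Minimal voting-rule sketch: Borda count over ranked ballots.
# The ballot data is a hypothetical example.
from collections import Counter

ballots = [            # each ballot ranks candidates, best first
    ["a", "b", "c"],
    ["a", "c", "b"],
    ["b", "c", "a"],
]

def borda(ballots):
    scores = Counter()
    m = len(ballots[0])
    for ballot in ballots:
        for position, candidate in enumerate(ballot):
            scores[candidate] += m - 1 - position  # top rank earns m-1 points
    return scores.most_common(1)[0][0]

print(borda(ballots))  # 'a' wins with 4 points (vs. 3 for 'b', 2 for 'c')
```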

Outcomes:

Attendees will acquire the basics of computational social choice. They will additionally be offered practical ways to benchmark social choice mechanisms using techniques borrowed from agent research and machine learning. Researchers in social choice theory and computational social choice will be able to use the Python SDK to test their theories and hypotheses with realistic voting data. The tutorial will also introduce the 1st Computational Social Choice Competition (COMPSOC 2023), its goals and prospects, and prepare attendees for COMPSOC 2024.

Prerequisites:

Attendees should have basic knowledge of decision theory, utility theory, and social choice. The technical parts of the tutorial require some understanding of machine learning approaches, such as generative models. The practical parts require knowledge of Python programming, algorithmic complexity, and agent-based development.

Presenter: A/Prof Rafik Hadfi


Rafik Hadfi is an Associate Professor in the Department of Social Informatics at Kyoto University in Japan. He received his Ph.D. from the Nagoya Institute of Technology in 2015 and worked in Japan and Australia before joining Kyoto University in 2020. Rafik's research interests lie in preference aggregation, agent-based negotiation, multiagent simulation, consensus building, conversational AI, and game theory. He previously led tutorials at top AI conferences, including AAMAS 2022, PRIMA 2022, and PRICAI 2021. Rafik is a reviewer for Group Decision and Negotiation, Artificial Intelligence Review, Neural Computation, Advanced Computational Intelligence and Intelligent Informatics, and Networked and Distributed Computing. Rafik has been the publication chair, workshop chair, tutorial chair, volunteer chair, program chair, and web chair for international AI conferences such as IJCAI, PRICAI, PRIMA, and IEEE ICA. Rafik received the Gregory Kersten GDN Journal Best Paper Award (2023), the Supply Chain Management League Competition Award at the 13th International Joint Conference on Artificial Intelligence (2021), the annual conference award from Japan's Society for Artificial Intelligence (2020), the IBM Award of Scientific Excellence (2020), the Best Paper Award from the Information Processing Society of Japan (2016), the IEEE Young Researcher Award (2014), and the AAAI Student Scholarship Award (2014).

Download Flyer

 

Tutorial #6: Q2AI: A Quick Course to Quick AI

Overview: 

State-of-the-art AI models are getting larger. As a result, they require high computational costs to train and deploy. Most companies, especially smaller ones and startups, will find it difficult to justify the rising cost of leveraging AI technologies for their business. Hence, AI developers need some knowledge of how to train and deploy efficient models. This tutorial aims to impart such knowledge to the audience.
In this tutorial, we will focus on the principles and hands-on implementation of Knowledge Distillation (KD) and Parameter-Efficient Fine-Tuning (PEFT), two of the most popular techniques for building efficient AI models. On top of that, we will also explore post-training quantization, pruning, and ONNX Runtime to make our models run even quicker.
The hands-on implementation will focus on applying KD and PEFT to Large Language Models (LLMs) such as BERT, XLM, and GPT. Nevertheless, the same principles can be applied in other contexts, such as Computer Vision or Speech Technologies, and external references to these will be provided.
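
As a preview of the hands-on part, below is a minimal sketch of the standard knowledge-distillation loss in PyTorch (our sketch, not the tutorial's exact pipeline): the student matches the teacher's softened logits via a KL term while still fitting the true labels. The temperature and mixing weight are illustrative values.

```python
# Sketch of the classic knowledge-distillation loss in PyTorch.
# Shapes, temperature, and alpha are illustrative assumptions.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    # KL divergence between softened distributions, scaled by T^2
    kd = F.kl_div(soft_student, soft_targets, reduction="batchmean")
    kd = kd * temperature ** 2
    ce = F.cross_entropy(student_logits, labels)  # fit the true labels
    return alpha * kd + (1 - alpha) * ce

student_logits = torch.randn(4, 10, requires_grad=True)
teacher_logits = torch.randn(4, 10)
labels = torch.randint(0, 10, (4,))
loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()   # gradients flow only into the student
print(float(loss))
```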

Outcomes:

  1. Participants learn the motivation for low-cost AI.
  2. Participants learn the relevant state-of-the-art approaches to low-cost AI.
  3. Participants learn the concepts of KD, PEFT, quantization, pruning, and ONNX Runtime.
  4. Participants experience building a fully functioning KD+PEFT training pipeline in PyTorch.
  5. Participants learn how to apply quantization and pruning, and how to convert PyTorch models to ONNX format (see the sketch after this list).
  6. Participants learn how to evaluate the resulting light model against the initial large model.
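
For items 3 and 5 above, the sketch below shows how dynamic int8 quantization and ONNX export are typically invoked in a recent PyTorch version, on a tiny stand-in model rather than the tutorial's LLMs.

```python
# Sketch of post-training optimization steps on a tiny stand-in model;
# assumes a recent PyTorch. The tutorial applies these to real LLMs.
import torch

model = torch.nn.Sequential(
    torch.nn.Linear(16, 32),
    torch.nn.ReLU(),
    torch.nn.Linear(32, 2),
).eval()

# Post-training dynamic quantization: Linear weights stored as int8.
quantized = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

# Export to ONNX so the model can run under ONNX Runtime.
dummy = torch.randn(1, 16)
torch.onnx.export(model, dummy, "model.onnx")

# Compare outputs of the original and the quantized model (item 6).
print(model(dummy), quantized(dummy))
```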

Prerequisites:

Some basic ideas of deep learning (DL) and how a typical AI model trained using DL works. Some hands-on experience with Python and PyTorch code would be crucial. Nevertheless, we welcome participants with minimal coding experience, as the conceptual explanation will still be beneficial for them. Code solutions will also be provided at the tutorial so each participant can revisit them later at their own convenience.

Presenter #1: Haryo Akbarianto Wibowo


Haryo is a Ph.D. student at MBZUAI, specializing in Natural Language Processing (NLP). His interests include researching low-resource NLP, such as exploring model efficiency (e.g., knowledge distillation), collaborating on creating low-resource data (e.g., NusaCrowd and
informal-formal Indonesian style transfer), and probing language models. He is also an experienced AI practitioner in the industry. Previously, he worked at Kata.ai, where he conducted some of the above-mentioned research and published research papers as a result. Furthermore,
he has experience in computer vision and speech.

 

Presenter #2: Alham Fikri Aji


Aji explores efficient NLP through model compression and distillation, and NLP for under-resourced languages. This involves dataset curation/construction, data-efficient learning/adaptation, zero-shot approaches, and building multilingual language models. He is currently active in the Indonesian and South-East Asian NLP research communities. Aji is an Assistant Professor at MBZUAI. Prior to joining MBZUAI, Aji was an applied research scientist at Amazon and a postdoctoral fellow at the Institute for Language, Cognition, and Computation at the University of Edinburgh. During his postdoctoral work and PhD, he contributed to efficient NMT-related projects, such as Marian, a fast NMT framework, and browser-based translation without using the cloud.

Presenter #3: Rendi Chevi


Rendi is a Research Assistant at MBZUAI, with a primary interest in generative modeling for speech and NLP tasks. Previously, Rendi worked as an AI Research Scientist at Kata.ai, where he worked on various low-resource and efficient NLP and speech models, such as Nix-TTS, a lightweight GAN-based text-to-speech model suitable for real-time inference on low-compute devices.

Presenter #4: Radityo (Ridho) Eko Prasojo


Ridho leads a team of industrial AI researchers and developers in Indonesia; previously at Kata.ai, and now at Pitik Digital Indonesia. At Kata.ai, Ridho oversaw the development of efficient AI models for chatbots by distilling LLMs and high-fidelity text-to-speech models. At Pitik, he leads the development of a scalable computer vision AI service for broiler farms through pre-distilled ONNX models running on the cloud.

Download Flyer

 

Tutorial #7: Reinforcement Learning for Digital Business (RL4DB)

Overview: 

This tutorial applies reinforcement learning (RL) to digital business challenges in online advertising and inventory management. Participants will learn critical concepts such as Markov decision processes and exploration-exploitation trade-offs. RL offers a robust framework for decision-making in dynamic and uncertain environments, enabling businesses to adapt and evolve strategies based on real-time feedback.
In online advertising, businesses must choose the most effective advertisements to display to users, while inventory management demands optimizing stock levels for maximum profit and cost reduction. RL provides a principled approach to tackling these challenges, empowering businesses to address complex digital problems effectively.
The tutorial covers important RL concepts, including constructing Markov decision processes to represent decision-making sequences accurately. Participants will learn how to balance exploration and exploitation, exploring new strategies while exploiting known successful ones. Reward shaping is a crucial aspect of RL, and participants will learn to design appropriate reward functions aligned with specific business goals; effective reward shaping guides the RL process toward desired outcomes, leading to improved decision-making. The tutorial includes hands-on code examples and demonstrations using popular RL libraries and frameworks, so participants gain practical experience implementing RL algorithms in digital business scenarios. By the tutorial's end, participants will have the tools and insights to apply RL effectively in their digital businesses: they will be able to identify suitable scenarios for RL, design state and action spaces and rewards, and select appropriate RL models. Leveraging RL, businesses can enhance efficiency, increase profitability, and gain a competitive advantage in the ever-changing digital landscape.
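
To ground the exploration-exploitation discussion, here is a minimal epsilon-greedy bandit sketch framed as ad selection. It is our toy example: the click-through rates are fabricated, and no RL library is assumed.

```python
# Epsilon-greedy bandit for ad selection: explore with probability
# epsilon, otherwise exploit the ad with the best estimated value.
# The true click-through rates are fabricated illustration values.
import random

true_ctr = {"ad_A": 0.05, "ad_B": 0.11, "ad_C": 0.08}  # hypothetical
counts = {ad: 0 for ad in true_ctr}
value = {ad: 0.0 for ad in true_ctr}
epsilon = 0.1
random.seed(0)

for step in range(10_000):
    if random.random() < epsilon:            # explore a random ad
        ad = random.choice(list(true_ctr))
    else:                                    # exploit the current best
        ad = max(value, key=value.get)
    reward = 1.0 if random.random() < true_ctr[ad] else 0.0
    counts[ad] += 1
    value[ad] += (reward - value[ad]) / counts[ad]  # incremental mean

print(max(value, key=value.get), value)  # estimates converge toward ad_B
```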

Outcomes:

The outcome of this tutorial is participants equipped with practical knowledge and skills to leverage reinforcement learning (RL) for digital business optimization. Through an in-depth exploration of RL concepts, including Markov decision processes and exploration-exploitation trade-offs, participants will gain a deep understanding of how RL can address complex challenges in online advertising and inventory management.
By the end of the tutorial, attendees will be proficient in designing state and action spaces, shaping rewards, and selecting appropriate RL models to optimize various aspects of their digital businesses. Hands-on code examples and demonstrations using popular RL libraries and frameworks ensure that participants gain valuable experience implementing RL algorithms for real-world applications.
Equipped with the knowledge and skills gained from the tutorial, participants can boost efficiency, increase profitability, and gain a competitive advantage in their specific industries. The tutorial aims to prepare digital business professionals to adopt RL as a robust decision-making tool, positioning themselves at the forefront of innovation in the constantly evolving digital domain.

Prerequisites:

A foundational understanding of machine learning and deep learning is required, encompassing both supervised and unsupervised learning. Additionally, familiarity with Python is necessary to participate. Prior experience in reinforcement learning is not required, as the tutorial will start from the basics, providing a comprehensive introduction to RL concepts, algorithms, and implementations.
Participants will be equipped with the necessary knowledge and skills to leverage RL effectively for optimizing their digital businesses, leading to improved efficiency, profitability, and a competitive advantage in the dynamic digital landscape.

Presenter #1: Edwin Simjaya


Edwin Simjaya is an AI expert and currently serves as the Head of AI & Software Center. With over 15 years of experience in software engineering and notable achievements in the field, Edwin has established himself as a professional in AI. His academic journey includes post-graduate studies in Mathematics at the University of Indonesia and an undergraduate degree in Computer Science from the University of Pelita Harapan.
Specializing in Natural Language Processing (NLP) and Reinforcement Learning, Edwin's expertise has led to groundbreaking contributions. Notably, he led the implementation of an AI Augmented Nutrigenomic Algorithm utilizing Large Language Models (LLM), revolutionizing the field. Additionally, Edwin manages Kalbe Digital University content and implementation and has delivered key projects for internal Kalbe. His excellence extends beyond corporate endeavors, as he actively shares insights as a keynote speaker at various corporate events.
Edwin's research contributions are equally impressive, evident in his paper on "Domain Adaptation for Nutrigenomic Knowledge Base System". His dedication to education is reflected in his role as an Assistant Lecturer at the University of Indonesia and Telkom University.

 

Presenter #2: Adhi Setiawan


Adhi Setiawan is an Artificial Intelligence Engineer at Kalbe Digital Lab, specializing in Reinforcement Learning and Computer Vision. He holds a Bachelor of Computer Science from the University of Brawijaya. Adhi's research contributions have significantly impacted the field of AI. He has authored research papers such as "Large scale pest classification using Efficient Convolutional Neural Network with Augmentation and Regularizers" and "Deteksi Covid-19 pada Citra Sinar-X Dada Menggunakan Pre-Training Deep Autoencoder", both published in reputable journals. Beyond his research pursuits, Adhi is actively involved in teaching and mentoring. He served as a Teaching Assistant at the University of Brawijaya in 2020 and has been an advisor for various Artificial Intelligence projects within Kalbe's internal business unit.
Adhi's dedication to the AI community is evident through his contributions to the Jakarta Artificial Intelligence Research, where he actively participates and shares his expertise. With a strong foundation in computer science and a passion for advancing AI technologies, Adhi Setiawan continues to explore the artificial intelligence field.

Download Flyer

 

Tutorial #8: Applications of Generative AI tools in Higher Education (AIEDU)

Overview: 

Although generative AI is taking the world by storm and the speed of adoption is unprecedented, OpenAI, its creator, has released very little information about the design and functional power of ChatGPT (one of the most common "shells" of generative AI software). As such, nearly everyone, including computer scientists, treats such software as a black box and learns about its functional capabilities only via trial and error. This is surely not good enough, as our own critical thinking has been somewhat compromised. Redressing this shortfall, this tutorial will gather valuable information from multiple authoritative sources to discuss the inherent characteristics of generative software and project their implications for core operations in higher education, including learning, teaching, assessments, and research.
Split into two half-sessions over a total of 3 hours, this tutorial will cover mini-lectures, recorded interviews, software demonstrations, case studies, and hands-on experiments by the attendees, followed by experience sharing and reflections.
On teaching and learning, we shall discuss the potential applications of generative AI software and their limitations, and how to incorporate such tools into the learning process while at the same time enhancing learners' critical thinking skills. Changes in assessments will also be covered, with cases to share. On the research side, a framework for categorising the various types of AI tools that support research will be outlined, and selected tools will be demonstrated. Advice on how to start using such tools to support various phases of the research process will be given, along with a summary of the discussions on current policies and guidelines from various authoritative sources.

Outcomes:

This tutorial comprehensively explores the application and impact of generative AI tools on tertiary education, covering learning, teaching, assessments, and research. By knowing more about the generic characteristics of generative software, the process and cost of developing such systems, and the role of the training data, attendees, irrespective of their background and experience, can apply higher-order critical thinking to assess the appropriateness, relevance, implications, and qualifications involved in designing applications that use such tools. This is not a technical session; rather, all explanations are in layman's terms.

Prerequisites:

Ideally speaking, the attendee should be a typical academic or a research student aiming to embark on an academic career. A basic understanding of the various stages in the research process is expected. Participants should also ensure they have fast and unobstructed access to the internet throughout the duration of the tutorial.

Presenter: Prof Eric Tsui


Eric Tsui is the Associate Director of the Behaviour and Knowledge Engineering (BAKE) Research Centre as well as a Senior Project Officer at the Educational Research Centre at The Hong Kong Polytechnic University. From 2015 to 2023, he served as the Regional Editor of the Journal of Knowledge Management, and he has led and delivered a Master's program in Knowledge Management for over 15 years. Eric has also championed many technology-enhanced teaching and learning projects and is a crusader of blended learning at the university. His research interests include Knowledge Management technologies, blended learning, cloud services, and collaborations. Eric is the leader of a Professional Certificate program that consists of two Massive Open Online Courses (MOOCs) on edX covering the topics of Knowledge Management, Big Data, and Industry 4.0. He holds B.Sc. (Hons.), PhD, and MBA qualifications. A recipient of many international Knowledge Management and E-Learning awards, including the Knowledge Management Award for Excellence in 2021, Professor Tsui was twice listed as an exemplary/outstanding academic in PolyU Annual Reports in the last 7 years.

Download Flyer

 

Tutorial #9: Unleashing the Power of Large Language Models: A Comprehensive Tutorial on Training LLM using Alpaca + LoRA (AlpaRa)

Overview: 

This tutorial session explores the exciting world of training large language models (LLMs), focusing on Alpaca 7B and integrating Low-Rank Adaptation (LoRA) techniques to enhance performance. LLMs have emerged as powerful tools in natural language processing, facilitating various applications like language understanding, sentiment analysis, and text generation. Our goal is to equip attendees with the expertise to create domain-specific LLMs for text generation using Alpaca + LoRA, tailoring them to excel in specific areas of interest.
The tutorial begins with a short history of natural language processing, from basic NLP models such as RNNs/LSTMs to the current state of the art, Transformers and their variants. After that, the tutorial will explain the pre-training of Alpaca 7B, a vast LLM initially exposed to diverse internet data, capturing the nuances of general knowledge and language patterns. We will then delve into the fine-tuning process, which involves training the model on domain-specific datasets. This fine-tuning procedure is instrumental in shaping LLMs to grasp domain-specific concepts and patterns, rendering them more effective and reliable in targeted domains.
Beyond fine-tuning, we introduce participants to innovative Low-Rank Adaptation (LoRA) techniques, a breakthrough model compression and optimization method. LoRA enables us to compress and reduce the size of LLMs without incurring significant performance loss. As a result, we obtain more lightweight and computationally efficient models, which are crucial for deploying LLMs on resource-constrained consumer devices.
The culmination of combining Alpaca 7B with fine-tuning and LoRA strategies yields LLMs that offer advanced natural language processing capabilities. These specialized models can process and comprehend complex language patterns within specific domains, making them invaluable for researchers, practitioners, and enthusiasts.
The tutorial provides hands-on experience in training LLMs using Alpaca 7B and implementing LoRA techniques. We guide participants through the entire training process. Engaging Q&A sessions and open discussions offer valuable opportunities for attendees to share insights and experiences.
Participants will be equipped with the skills to harness the power of LLMs for their specific domains and achieve superior performance compared to state-of-the-art results in their research and applications. Join us in this transformative tutorial to unlock the vast potential of domain-specific LLMs and pave the way for cutting-edge advancements in AI-driven language understanding and processing domain-specific language models.
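
As a preview of the hands-on session, the sketch below attaches LoRA adapters to a small causal language model using the Hugging Face `peft` package. GPT-2 stands in for Alpaca 7B here, and the hyperparameters are illustrative assumptions rather than the tutorial's exact recipe.

```python
# Sketch of attaching LoRA adapters to a causal LM; assumes the Hugging
# Face `transformers` and `peft` packages. GPT-2 is a small stand-in
# for Alpaca 7B, and the hyperparameters are illustrative.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("gpt2")

config = LoraConfig(
    r=8,                        # rank of the low-rank update matrices
    lora_alpha=16,              # scaling factor for the updates
    lora_dropout=0.05,
    target_modules=["c_attn"],  # GPT-2's attention projection layer
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)

# Only the small adapter matrices are trainable; the base model is
# frozen, which is what makes LoRA fine-tuning lightweight.
model.print_trainable_parameters()
```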

Outcomes:

By the end of this tutorial, participants will achieve tangible outcomes that will enhance their understanding and expertise in training domain-specific language models (LLMs) using Alpaca 7B and Low-Rank Adaptation (LoRA) techniques.

  1. Comprehensive Understanding of LLMs: Participants will gain a thorough comprehension of LLMs, including their pre-training process, capabilities, and applications in text generation tasks.
  2. Hands-on Training Experience: Attendees will acquire practical hands-on experience training LLMs using Alpaca 7B and implementing LoRA techniques. They will be guided through the entire training process.
  3. Domain-specific LLM Creation: Participants will understand the fine-tuning process to adapt the models to grasp domain-specific concepts and patterns.
  4. LoRA Technique Implementation: Participants will learn how to implement innovative Low-Rank Adaptation (LoRA) techniques for model compression and optimization. They will understand how to achieve more lightweight and efficient LLMs without compromising performance.
  5. Active Engagement and Networking: The tutorial fosters a collaborative environment, encouraging active participation through Q&A sessions and open discussions. Attendees will have valuable engagement opportunities to share insights, experiences, and challenges with presenters and peers.

Prerequisites:

To make the most of this tutorial, participants should have a foundational understanding of Python programming and a grasp of key concepts in Machine Learning, Deep Learning, and Natural Language Processing (NLP). Familiarity with basic Large Language Models (LLMs) will be beneficial but is not required. This tutorial assumes moderate expertise in the mentioned areas to ensure attendees can actively engage in the hands-on training and discussions. Prior experience with LLMs will enable participants to delve deeper into domain-specific applications and LoRA techniques, enhancing their overall learning experience during the tutorial.

Presenter #1: Edwin Simjaya


Edwin Simjaya is an AI expert and currently serves as the Head of AI & Software Center. With over 15 years of experience in software engineering and notable achievements in the field, Edwin has established himself as a professional in AI. His academic journey includes post-graduate studies in Mathematics at the University of Indonesia and an undergraduate degree in Computer Science from the University of Pelita Harapan.
Specializing in Natural Language Processing (NLP) and Reinforcement Learning, Edwin's expertise has led to groundbreaking contributions. Notably, he led the implementation of an AI Augmented Nutrigenomic Algorithm utilizing Large Language Models (LLM), revolutionizing the field. Additionally, Edwin manages Kalbe Digital University content and implementation and has delivered key projects for internal Kalbe. His excellence extends beyond corporate endeavors, as he actively shares insights as a keynote speaker at various corporate events.
Edwin's research contributions are equally impressive, evident in his paper on "Domain Adaptation for Nutrigenomic Knowledge Base System". His dedication to education is reflected in his role as an Assistant Lecturer at the University of Indonesia and Telkom University.

Presenter #2: Adhi Setiawan


Adhi Setiawan is an Artificial Intelligence Engineer at Kalbe Digital Lab, specializing in Reinforcement Learning and Computer Vision. He holds a Bachelor of Computer Science from the University of Brawijaya.
Adhi's research contributions have significantly impacted the field of AI. He has authored research papers such as "Large scale pest classification using Efficient Convolutional Neural Network with Augmentation and Regularizers" and "Deteksi Covid-19 pada Citra Sinar-X Dada Menggunakan Pre-Training Deep Autoencoder", both published in reputable journals.
Beyond his research pursuits, Adhi is actively involved in teaching and mentoring. He served as a Teaching Assistant at the University of Brawijaya in 2020 and has been an advisor for various Artificial Intelligence projects within Kalbe's internal business unit.
Adhi's dedication to the AI community is evident through his contributions to the Jakarta Artificial Intelligence Research, where he actively participates and shares his expertise. With a strong foundation in computer science and a passion for advancing AI technologies, Adhi Setiawan continues to explore the artificial intelligence field.

Presenter #3: Shinta Roudlotul Hanafia


Shinta Roudlotul Hanafia is an accomplished AI Engineer with a passion for cutting-edge technologies. She holds a bachelor's degree in computer engineering from Telkom University (2022), where she honed her skills in the intersection of AI and technology.
Currently serving as an AI Engineer at Kalbe Farma, Shinta plays a role in the research and development of AI solutions for healthcare, with a primary focus on NLP applications. 

Shinta is dedicated to knowledge sharing. She contributes to the Machine Learning Indonesia YouTube channel as a speaker and manager. Her enthusiasm for education and mentorship is further evident in her roles as a Machine Learning Teaching Assistant (2022-now) and a Practical Assistant of Object-Oriented Programming for Python Language (2021).
Driven by a relentless curiosity and a passion for constant growth, Shinta Roudlotul Hanafia is eager to continue expanding her expertise in Artificial Intelligence, particularly in the field of Natural Language Processing (NLP). Recognizing the transformative potential of NLP in revolutionizing how machines understand and interact with human language, she is dedicated to advancing this domain.

Download Flyer

 

Tutorial #10: Current Interfaces of Logic and AI: the case of natural language (LAI)

Overview: 

Logic and AI share a long history going back to the emergence of computer science. Of the many themes in this contact, this tutorial focuses on just one of high current relevance: the role of reasoning in natural language. We will introduce the audience to ‘natural logic’ systems for fast inferencing in natural language, then to epistemic logics highlighting the role of agents and their information in language use, and finally to dynamic-epistemic logics that describe scenarios in communication and conversation planning. In all three cases, we give illustrations of how these topics connect to machine-learning-based AI and the interface of symbolic and subsymbolic computing. We conclude with some pointers to other topics in the recently emerging contacts between logic and AI in its modern phase.

The tutorial will have a supporting webpage with further information and literature on recent contacts between logic and AI.
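
As a toy illustration of the 'natural logic' theme (ours, not the tutorial's system), the sketch below shows monotonicity-based inference: because the first argument of 'every' is downward monotone, replacing it with a narrower term preserves truth without building a full proof.

```python
# Toy monotonicity inference: 'every' is downward monotone in its first
# argument, so narrowing that argument preserves truth. The mini-lexicon
# of subsumptions is a hypothetical example.
subsumes = {("poodle", "dog"), ("dog", "animal")}

def narrower(a: str, b: str) -> bool:
    """Is `a` at least as specific as `b` under the lexicon?"""
    if a == b or (a, b) in subsumes:
        return True
    return any(x == a and narrower(y, b) for x, y in subsumes)

def every_entails(premise, conclusion):
    """'every A is C' entails 'every B is C' exactly when B is
    narrower than A and the predicate C is unchanged."""
    (a, c), (b, c2) = premise, conclusion
    return c == c2 and narrower(b, a)

# "every dog barks" entails "every poodle barks" ...
print(every_entails(("dog", "barks"), ("poodle", "barks")))  # True
# ... but not "every animal barks".
print(every_entails(("dog", "barks"), ("animal", "barks")))  # False
```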

Outcomes:

  • Learn the basic 'natural logic' system modelling fast correct inferencing in natural language as opposed to the ‘slow thinking’ of mathematical proof systems.
  • Gain basic insights into epistemic logics and how they highlight the social role of agents and their information in language use.
  • Understand dynamic-epistemic logics for communication and planning. This will help participants appreciate how these logics are applied in real-world AI systems.

Prerequisites:

  • A background in logic, including propositional and predicate logic, will be beneficial for comprehending the content of the tutorial.
  • Some basic knowledge of classical and modern AI concepts and techniques.

Presenter #1: Johan van Benthem


Johan van Benthem is a University Professor emeritus at the University of Amsterdam, Henry Waldgrave Stuart Professor at Stanford University, and Jin Yuelin Professor at Tsinghua University Beijing. He has worked in modal logic, temporal logic, logical semantics and syntax of natural language, and dynamic logics of information, computation, and agency. He was the founding director of the 'Institute for Logic, Language and Computation' (ILLC) at the University of Amsterdam, and the first Chair and first Honorary Member of the European Association for Language, Logic and Information (FoLLI). His current main interests lie at the interface of logic, computer science, cognitive science, and game theory.

 

Presenter #2: Fenrong Liu


Fenrong Liu is a Changjiang Distinguished Professor at Tsinghua University, Amsterdam-China Logic Chair at the University of Amsterdam, and Co-Director of the Tsinghua - UvA Joint Research Centre for Logic. Her research areas include reasoning about preference and preference change, social network logics, causality and foundations of AI. She also maintains active interests in Chinese ancient logic. Currently she is editing a Handbook of Logical Thought in China.

Download Flyer