
Machine Learning

  • Reports substantive results on a wide range of learning methods applied to various learning problems.
  • Provides robust support through empirical studies, theoretical analysis, or comparison to psychological phenomena.
  • Demonstrates how to apply learning methods to solve significant application problems.
  • Improves how machine learning research is conducted.
  • Prioritizes verifiable and replicable supporting evidence in all published papers.
  • Editor-in-Chief: Hendrik Blockeel


Latest issue

Volume 113, Issue 7

Latest articles

A generic approach for reproducible model distillation

  • Yunzhe Zhou
  • Giles Hooker


Describing group evolution in temporal data using multi-faceted events

  • Andrea Failla
  • Rémy Cazabet
  • Salvatore Citraro


Autoreplicative random forests with applications to missing value imputation

  • Ekaterina Antonenko
  • Ander Carreño


Neural calibration of hidden inhomogeneous Markov chains: information decompression in life insurance

  • Mark Kiermayer
  • Christian Weiß


Integration of multi-modal datasets to estimate human aging

  • Rogério Ribeiro
  • Athos Moraes
  • Pedro G. Ferreira


Journal updates

CfP: Discovery Science 2023

Submission Deadline: March 4, 2024

Guest Editors: Rita P. Ribeiro, Albert Bifet, Ana Carolina Lorena

CfP: IJCLR Learning and Reasoning

Call for Papers: Conformal Prediction and Distribution-Free Uncertainty Quantification

Submission Deadline: January 7, 2024

Guest Editors: Henrik Boström, Eyke Hüllermeier, Ulf Johansson, Khuong An Nguyen, Aaditya Ramdas

Call for Papers: Special Issue on Explainable AI for Secure Applications

Submissions Open: October 15, 2024

Submission Deadline: January 15, 2025

Guest Editors: Annalisa Appice, Giuseppina Andresini, Przemysław Biecek, Christian Wressnegger

Journal information

  • ACM Digital Library
  • Current Contents/Engineering, Computing and Technology
  • EI Compendex
  • Google Scholar
  • Japanese Science and Technology Agency (JST)
  • Mathematical Reviews
  • OCLC WorldCat Discovery Service
  • Science Citation Index Expanded (SCIE)
  • TD Net Discovery Service
  • UGC-CARE List (India)



Trending research

MindSearch: Mimicking Human Minds Elicits Deep AI Searcher

internlm/mindsearch • 29 Jul 2024

Inspired by the cognitive process when humans solve these problems, we introduce MindSearch to mimic the human minds in web information seeking and integration, which can be instantiated by a simple yet effective LLM-based multi-agent framework.

FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs


This report introduces FunAudioLLM, a model family designed to enhance natural voice interactions between humans and large language models (LLMs).

Arbitrary-Scale Video Super-Resolution with Structural and Textural Priors

Arbitrary-scale video super-resolution (AVSR) aims to enhance the resolution of video frames, potentially at various scaling factors, which presents several challenges regarding spatial detail reproduction, temporal consistency, and computational complexity.


LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders

We outperform encoder-only models by a large margin on word-level tasks and reach a new unsupervised state-of-the-art performance on the Massive Text Embeddings Benchmark (MTEB).


Global Structure-from-Motion Revisited

colmap/glomap • 29 Jul 2024

Recovering 3D structure and camera motion from images has been a long-standing focus of computer vision research and is known as Structure-from-Motion (SfM).

"Do Anything Now": Characterizing and Evaluating In-The-Wild Jailbreak Prompts on Large Language Models

We hope that our study can facilitate the research community and LLM vendors in promoting safer and regulated LLMs.


SGLang: Efficient Execution of Structured Language Model Programs

SGLang consists of a frontend language and a runtime.


CatVTON: Concatenation Is All You Need for Virtual Try-On with Diffusion Models

Virtual try-on methods based on diffusion models achieve realistic try-on effects but often replicate the backbone network as a ReferenceNet or use additional image encoders to process condition inputs, leading to high training and inference costs.


TorchRL: A data-driven decision-making library for PyTorch

PyTorch has ascended as a premier machine learning framework, yet it lacks a native and comprehensive library for decision and control tasks suitable for large development teams dealing with complex real-world data and environments.

OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion

To address these challenges, we propose a novel unified open-vocabulary detection method called OV-DINO, which is pre-trained on diverse large-scale datasets with language-aware selective fusion in a unified framework.



Machine learning articles within Nature

News | 01 August 2024

These AI firms publish the world’s most highly cited work

US and Chinese firms dominate the list of companies that are producing the most research and patents in artificial intelligence.

  • Elizabeth Gibney

News & Views | 31 July 2024

Cheap light sources could make AI more energy efficient

Light-based devices can reduce the energy consumption of computers, but most rely on lasers, which are expensive to integrate with other technologies. An approach that uses LEDs instead of lasers provides a path forwards.

  • Kathy Lüdge
  • Lina Jaurigue

News Feature | 30 July 2024

AI is complicating plagiarism. How should scientists respond?

The explosive uptake of generative artificial intelligence in writing is raising difficult questions about when use of the technology should be allowed.

Outlook | 25 July 2024

AI is vulnerable to attack. Can it ever be used safely?

The models that underpin artificial-intelligence systems such as ChatGPT can be subject to attacks that elicit harmful behaviour. Making them safe will not be easy.

  • Simon Makin

News Explainer | 25 July 2024

Three ways AI is changing the 2024 Olympics for athletes and fans

From training to broadcasting, artificial intelligence will have an imprint on this year’s event for the first time.

  • Sumeet Kulkarni

News | 25 July 2024

DeepMind hits milestone in solving maths problems — AI’s next grand challenge

AlphaProof showed its prowess on questions from this year’s Mathematical Olympiad — a step in the race to create substantial proofs with artificial intelligence.

  • Davide Castelvecchi

News & Views | 24 July 2024

AI produces gibberish when trained on too much AI-generated data

Generative AI models are now widely accessible, enabling everyone to create their own machine-made something. But these models can collapse if their training data sets contain too much AI-generated content.

  • Emily Wenger

News | 24 July 2024

AI models fed AI-generated data quickly spew nonsense

Researchers gave successive versions of a large language model information produced by previous generations of the AI — and observed rapid collapse.

News | 23 July 2024

Google AI predicts long-term climate trends and weather — in minutes

Models that are more reliable and less energy-intensive could help us to better prepare for extreme weather.

  • Helena Kudiabor

Technology Feature | 22 July 2024

ChatGPT for science: how to talk to your data

Companies are using artificial intelligence tools to help scientists to query their data without the need for programming skills.

  • Julian Nowogrodzki

Correspondence | 16 July 2024

Generative AI makes for better scientific writing — but beware the pitfalls

  • Zhongkai Liao
  • Congcong Zhang

Technology Feature | 03 July 2024

Inside the maths that drives AI

Loss functions measure algorithmic errors in artificial-intelligence models, but there’s more than one way to do that. Here’s why the right function is so important.

  • Michael Brooks

Nature Podcast | 26 June 2024

Why ‘open source’ AIs could be anything but, the derailment risks of long freight trains, and breeding better wheat

We round up some recent stories from the Nature Briefing.

  • Nick Petrić Howe
  • Benjamin Thompson
  • Dan Fox

Correspondence | 25 June 2024

AI machine translation tools must be taught cultural differences too

  • Helene Tenzer
  • Stefan Feuerriegel
  • Rebecca Piekkari

News & Views | 19 June 2024

‘Fighting fire with fire’ — using LLMs to combat LLM hallucinations

The number of errors produced by an LLM can be reduced by grouping its outputs into semantically similar clusters. Remarkably, this task can be performed by a second LLM, and the method’s efficacy can be evaluated by a third.

  • Karin Verspoor

News | 19 June 2024

Not all ‘open source’ AI models are actually open: here’s a ranking

Many of the large language models that power chatbots claim to be open, but restrict access to code and training data.

Article 19 June 2024 | Open Access

Computational design of soluble and functional membrane protein analogues

A deep learning approach enables accurate computational design of soluble and functional analogues of membrane proteins, expanding the soluble protein fold space and facilitating new approaches to drug screening and design.

  • Casper A. Goverde
  • Martin Pacesa
  • Bruno E. Correia

News Q&A | 13 June 2024

AI and Euro 2024: VAR is shaking up football — and it’s not going away

Sports physicist Eric Goff explains how updates to the technology can help referees to make the toughest calls.

Article | 12 June 2024

A Multimodal Generative AI Copilot for Human Pathology

  • Bowen Chen
  • Faisal Mahmood

Editorial | 05 June 2024

Meta’s AI system is a boost to endangered languages — as long as humans aren’t forgotten

Automated approaches to translation could provide a lifeline to under-resourced languages, but only if companies engage with the people who speak them.

Career Column | 05 June 2024

Need a policy for using ChatGPT in the classroom? Try asking students

Students are the key users of AI chatbots in university settings, but their opinions are rarely solicited when crafting policies. That needs to change, says Maja Zonjić.

  • Maja Zonjić

News & Views | 05 June 2024

Meta’s AI translation model embraces overlooked languages

More than 7,000 languages are in use throughout the world, but popular translation tools cannot deal with most of them. A translation model that was tested on under-represented languages takes a key step towards a solution.

  • David I. Adelani

News | 04 June 2024

Superfast Microsoft AI is first to predict air pollution for the whole world

The model, called Aurora, also forecasts global weather for ten days — all in less than a minute.

  • Carissa Wong

News Feature | 04 June 2024

How cutting-edge computer chips are speeding up the AI revolution

Engineers are harnessing the powers of graphics processing units (GPUs) and more, with a bevy of tricks to meet the computational demands of artificial intelligence.

  • Dan Garisto

Outlook | 29 May 2024

AI assistance for planning cancer treatment

Armed with the right data, advances in machine learning could help oncologists to home in quickly on the best treatment strategies for their patients.

  • Michael Eisenstein

Nature Podcast | 29 May 2024

How AI could improve robotics, the cockroach’s origins, and promethium spills its secrets

  • Benjamin Thompson
  • Elizabeth Gibney
  • Flora Graham

News Explainer | 29 May 2024

Who owns your voice? Scarlett Johansson OpenAI complaint raises questions

In the age of artificial intelligence, situations are emerging that challenge the laws over rights to a persona.

  • Nicola Jones

Correspondence | 28 May 2024

Anglo-American bias could make generative AI an invisible intellectual cage

  • Queenie Luo
  • Michael Puett

News Feature | 28 May 2024

The AI revolution is coming to robots: how will it change them?

The melding of artificial intelligence and robotics could catapult both fields to new heights.

Nature Podcast | 24 May 2024

Audio long read: How does ChatGPT ‘think’? Psychology and neuroscience crack open AI large language models

To understand the 'brains' of LLMs, researchers are attempting to reverse-engineer artificial intelligence systems.

  • Matthew Hutson
  • Benjamin Thompson

News | 23 May 2024

Superstar porous materials get salty thanks to computer simulations

Model predicts the structure of previously elusive compounds with practical applications.

  • Ariana Remmel

Editorial | 22 May 2024

AlphaFold3 — why did Nature publish it without its code?

Criticism of our decision to publish AlphaFold3 raises important questions. We welcome readers’ views.

News & Views | 22 May 2024

AI networks reveal how flies find a mate

Artificial neural networks that model the visual system of a male fruit fly can accurately predict the insect’s behaviour in response to seeing a potential mate — paving the way for the building of more complex models of brain circuits.

  • Pavan Ramdya

News | 22 May 2024

China’s ChatGPT: why China is building its own AI chatbots

ChatGLM is one of hundreds of AI language models being developed for the Chinese language. It comes close to ChatGPT on many measures, say its creators.

  • Celeste Biever

News | 21 May 2024

First ‘bilingual’ brain-reading device decodes Spanish and English words

Artificial-intelligence system allows a man who cannot speak coherently to have a conversation in the language of his choice.

  • Amanda Heidt

Technology Feature | 20 May 2024

DeepLabCut: the motion-tracking tool that went viral

Mackenzie and Alexander Mathis were still early in their careers when their software created a sensation. Now they’re using it to support other young scientists.

  • Jyoti Madhusoodanan

World View | 14 May 2024

Why mathematics is set to be revolutionized by AI

Cheap data and the absence of coincidences make maths an ideal testing ground for AI-assisted discovery — but only humans will be able to tell good conjectures from bad ones.

  • Thomas Fink

News Feature | 14 May 2024

How does ChatGPT ‘think’? Psychology and neuroscience crack open AI large language models

Researchers are striving to reverse-engineer artificial intelligence and scan the ‘brains’ of LLMs to see what they are doing, how and why.

News Q&A | 09 May 2024

The US Congress is taking on AI — this computer scientist is helping

Kiri Wagstaff, who temporarily shelved her academic career to provide advice on federal AI legislation, talks about life inside the halls of power.

Research Briefing | 08 May 2024

‘Ghost roads’ could be the biggest direct threat to tropical forests

By using volunteers to map roads in forests across Borneo, Sumatra and New Guinea, an innovative study shows that existing maps of the Asia-Pacific region are rife with errors. It also reveals that unmapped roads are extremely common — up to seven times more abundant than mapped ones. Such ‘ghost roads’ are promoting illegal logging, mining, wildlife poaching and deforestation in some of the world’s biologically richest ecosystems.

News | 08 May 2024

Major AlphaFold upgrade offers boost for drug discovery

Latest version of the AI models how proteins interact with other molecules — but DeepMind restricts access to the tool.

  • Ewen Callaway

Article 08 May 2024 | Open Access

Accurate structure prediction of biomolecular interactions with AlphaFold 3

AlphaFold 3 has a substantially updated architecture that is capable of predicting the joint structure of complexes including proteins, nucleic acids, small molecules, ions and modified residues with greatly improved accuracy over many previous specialized tools.

  • Josh Abramson
  • Jonas Adler
  • John M. Jumper

News | 03 May 2024

Who’s making chips for AI? Chinese manufacturers lag behind US tech giants

Researchers in China say they are finding themselves five to ten years behind their US counterparts as export restrictions bite.

  • Jonathan O'Callaghan

News | 29 April 2024

‘ChatGPT for CRISPR’ creates new gene-editing tools

Some of the AI-designed gene editors could be more versatile than those found in nature.

News Feature | 23 April 2024

Lethal AI weapons are here: how can we control them?

Autonomous weapons guided by artificial intelligence are already in use. Researchers, legal experts and ethicists are struggling with what should be allowed on the battlefield.

Comment | 22 April 2024

Will AI accelerate or delay the race to net-zero emissions?

As artificial intelligence transforms the global economy, researchers need to explore scenarios to assess how it can help, rather than harm, the climate.

  • Jonathan Koomey
  • Eric Horvitz

Outlook | 18 April 2024

AI’s keen diagnostic eye

Powered by deep-learning algorithms, artificial intelligence systems could replace agents such as chemicals currently used to augment medical scans.

  • Neil Savage

Nature Video | 17 April 2024

AI and robotics demystify the workings of a fly's wing

New research unveils the workings of one of the most complex bio-mechanical structures in the natural world.

News | 17 April 2024

AI traces mysterious metastatic cancers to their source

Algorithm examines images of metastatic cells to identify the location of the primary tumour.

  • Smriti Mallapaty

News | 15 April 2024

AI now beats humans at basic tasks — new benchmarks are needed, says major report

Stanford University’s 2024 AI Index charts the meteoric rise of artificial-intelligence tools.



Machine Learning: Models, Challenges, and Research Directions


1. Introduction

  • Brief discussion of data pre-processing;
  • Detailed classification of supervised, semi-supervised, unsupervised, and reinforcement learning models;
  • Study of known optimization techniques;
  • Challenges of machine learning in the field of cybersecurity.

2. Related Work and Research Methodology

Reference | Year | Study Highlights | Coverage of Data Pre-Processing and Hyperparameter Tuning | Coverage of Machine Learning (Supervised, Unsupervised, Semi-Supervised, Reinforcement)

  • [ ] (2021) Describes the known deep learning models, their principles, and characteristics.
  • [ ] (2019) Focuses on a limited set of machine learning techniques, applied only to software-defined networking.
  • [ ] (2022) Investigates the known issues in the field of system design that can be solved using machine learning techniques.
  • [ ] (2021) Presents a detailed description of a few supervised models and their optimization techniques.
  • [ ] (2021) Provides an overview of semi-supervised machine learning techniques and their existing algorithms.
  • [ ] (2022) Provides the state of the art, challenges, and limitations of supervised models in the field of maritime risk analysis.
  • [ ] (2022) Reviews hardware architectures for reinforcement learning algorithms.
  • [ ] (2022) Presents the existing algorithms for wireless sensor networks and describes the challenges of using such techniques.
  • [ ] (2016) Describes most of the known supervised algorithms for classification problems.
  • [ ] (2019) Provides a description of known supervised and unsupervised models.
  • [ ] (2021) Discusses supervised and unsupervised deep learning models for intrusion detection systems.
  • [ ] (2021) Surveys existing supervised and unsupervised techniques in the smart grid.
  • [ ] (2021) Explains known algorithms for image classification.
  • [ ] (2022) Illustrates the unsupervised deep learning models and summarizes their challenges.
  • [ ] (2023) Discusses techniques for future energy usage.
  • [ ] (2020) Reviews various ML techniques for the security of the Internet of Things.
  • [ ] (2020) Proposes a taxonomy of machine learning techniques for the security of the Internet of Things.
  • [ ] (2019) Surveys the taxonomy of machine learning models in intrusion detection systems.
  • [ ] (2022) Reviews ML techniques in industrial control systems.
  • [ ] (2022) Proposes a taxonomy of intrusion detection systems for supervised models.

3. Machine Learning Models

3.1. Supervised Learning

3.2. Semi-Supervised Learning

3.3. Unsupervised Learning

3.4. Reinforcement Learning

4. Machine Learning Processes

4.1. Data Pre-Processing

4.2. Tuning Approaches

4.3. Evaluation Metrics

4.3.1. Evaluation Metrics for Supervised Learning

4.3.2. Evaluation Metrics for Unsupervised Learning Models

4.3.3. Evaluation Metrics for Semi-Supervised Learning Models

4.3.4. Evaluation Metrics for Reinforcement Learning Models

5. Challenges and Future Directions

6. Conclusions

Author Contributions

Data Availability Statement

Conflicts of Interest

  • Sarker, I.H. Machine Learning: Algorithms, real-world applications and research directions. SN Comput. Sci. 2021 , 2 , 160. [ Google Scholar ] [ CrossRef ]
  • Vinuesa, R.; Azizpour, H.; Leite, I.; Balaam, M.; Dignum, V.; Domisch, S.; Felländer, A.; Langhans, S.D.; Tegmark, M.; Nerini, F.F. The role of artificial intelligence in achieving the sustainable development goals. Nat. Commun. 2020 , 11 , 233. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Ullah, Z.; Al-Turjman, F.; Mostarda, L.; Gagliardi, R. Applications of artificial intelligence and machine learning in smart cities. Comput. Commun. 2020 , 154 , 313–323. [ Google Scholar ] [ CrossRef ]
  • Ozcanli, A.K.; Yaprakdal, F.; Baysal, M. Deep learning methods and applications for electrical power systems: A comprehensive review. Int. J. Energy Res. 2020 , 44 , 7136–7157. [ Google Scholar ] [ CrossRef ]
  • Zhao, S.; Blaabjerg, F.; Wang, H. An Overview of Artificial Intelligence Applications for Power Electronics. IEEE Trans. Power Electron. 2021 , 36 , 4633–4658. [ Google Scholar ] [ CrossRef ]
  • Mamun, A.A.; Sohel, M.; Mohammad, N.; Sunny, M.S.H.; Dipta, D.R.; Hossain, E. A Comprehensive Review of the Load Forecasting Techniques Using Single and Hybrid Predictive Models. IEEE Access 2020 , 8 , 134911–134939. [ Google Scholar ] [ CrossRef ]
  • Massaoudi, M.; Darwish, A.; Refaat, S.S.; Abu-Rub, H.; Toliyat, H.A. UHF Partial Discharge Localization in Gas-Insulated Switchgears: Gradient Boosting Based Approach. In Proceedings of the 2020 IEEE Kansas Power and Energy Conference (KPEC), Manhattan, KS, USA, 13–14 July 2020; pp. 1–5. [ Google Scholar ]
  • Ali, S.S.; Choi, B.J. State-of-the-Art Artificial Intelligence Techniques for Distributed Smart Grids: A Review. Electronics 2020 , 9 , 1030. [ Google Scholar ] [ CrossRef ]
  • Yin, L.; Gao, Q.; Zhao, L.; Zhang, B.; Wang, T.; Li, S.; Liu, H. A review of machine learning for new generation smart dispatch in power systems. Eng. Appl. Artif. Intell. 2020 , 88 , 103372. [ Google Scholar ] [ CrossRef ]
  • Peng, S.; Sun, S.; Yao, Y.-D. A Survey of Modulation Classification Using Deep Learning: Signal Representation and Data Preprocessing. In IEEE Transactions on Neural Networks and Learning Systems ; IEEE: New York, NY, USA, 2021. [ Google Scholar ]
  • Arjoune, Y.; Kaabouch, N. A Comprehensive Survey on Spectrum Sensing in Cognitive Radio Networks: Recent Advances, New Challenges, and Future Research Directions. Sensors 2019 , 19 , 126. [ Google Scholar ] [ CrossRef ]
  • Meng, T.; Jing, X.; Yan, Z.; Pedrycz, W. A survey on machine learning for data fusion. Inf. Fusion 2020 , 57 , 115–129. [ Google Scholar ] [ CrossRef ]
  • Carvalho, D.V.; Pereira, E.M.; Cardoso, J.S. Machine Learning Interpretability: A Survey on Methods and Metrics. Electronics 2019 , 8 , 832. [ Google Scholar ] [ CrossRef ]
  • Khoei, T.T.; Ismail, S.; Kaabouch, N. Boosting-based Models with Tree-structured Parzen Estimator Optimization to Detect Intrusion Attacks on Smart Grid. In Proceedings of the 2021 IEEE 12th Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON), New York, NY, USA, 1–4 December 2021; pp. 165–170. [ Google Scholar ] [ CrossRef ]
  • Hutter, F.; Lücke, J.; Schmidt-Thieme, L. Beyond manual tuning of hyperparameters. KI-Künstliche Intell. 2015 , 29 , 329–337. [ Google Scholar ] [ CrossRef ]
  • Khoei, T.T.; Aissou, G.; Hu, W.C.; Kaabouch, N. Ensemble Learning Methods for Anomaly Intrusion Detection System in Smart Grid. In Proceedings of the IEEE International Conference on Electro Information Technology (EIT), Mt. Pleasant, MI, USA, 14–15 May 2021; pp. 129–135. [ Google Scholar ] [ CrossRef ]
  • Waubert de Puiseau, C.; Meyes, R.; Meisen, T. On reliability of reinforcement learning based production scheduling systems: A comparative survey. J. Intell. Manuf. 2022 , 33 , 911–927. [ Google Scholar ] [ CrossRef ]
  • Moos, J.; Hansel, K.; Abdulsamad, H.; Stark, S.; Clever, D.; Peters, J. Robust Reinforcement Learning: A Review of Foundations and Recent Advances. Mach. Learn. Knowl. Extr. 2022 , 4 , 276–315. [ Google Scholar ] [ CrossRef ]
  • Latif, S.; Cuayáhuitl, H.; Pervez, F.; Shamshad, F.; Ali, H.S.; Cambria, E. A survey on deep reinforcement learning for audio-based applications. Artif. Intell. Rev. 2022 , 56 , 2193–2240. [ Google Scholar ] [ CrossRef ]
  • Passah, A.; Kandar, D. A lightweight deep learning model for classification of synthetic aperture radar images. Ecol. Inform. 2023 , 77 , 102228. [ Google Scholar ] [ CrossRef ]
  • Verbraeken, J.; Wolting, M.; Katzy, J.; Kloppenburg, J.; Verbelen, T.; Rellermeyer, J.S. A survey on distributed machine learning. ACM Comput. Surv. 2020 , 53 , 1–33. [ Google Scholar ] [ CrossRef ]
  • Dargan, S.; Kumar, M.; Ayyagari, M.R.; Kumar, G. A survey of deep learning and its applications: A new paradigm to machine learning. Arch. Comput. Methods Eng. 2020 , 27 , 1071–1092. [ Google Scholar ] [ CrossRef ]
  • Pitropakis, N.; Panaousis, E.; Giannetsos, T.; Anastasiadis, E.; Loukas, G. A taxonomy and survey of attacks against machine learning. Comput. Sci. Rev. 2019 , 34 , 100199. [ Google Scholar ] [ CrossRef ]
  • Wu, X.; Xiao, L.; Sun, Y.; Zhang, J.; Ma, T.; He, L. A survey of human-in-the-loop for machine learning. Futur. Gener. Comput. Syst. 2022 , 135 , 364–381. [ Google Scholar ] [ CrossRef ]
  • Wang, Q.; Ma, Y.; Zhao, K.; Tian, Y. A comprehensive survey of loss functions in machine learning. Ann. Data Sci. 2022 , 9 , 187–212. [ Google Scholar ] [ CrossRef ]
  • Choi, H.; Park, S. A Survey of Machine Learning-Based System Performance Optimization Techniques. Appl. Sci. 2021 , 11 , 3235. [ Google Scholar ] [ CrossRef ]
  • Rawson, A.; Brito, M. A survey of the opportunities and challenges of supervised machine learning in maritime risk analysis. Transp. Rev. 2022 , 43 , 108–130. [ Google Scholar ] [ CrossRef ]
  • Ahmad, R.; Wazirali, R.; Abu-Ain, T. Machine Learning for Wireless Sensor Networks Security: An Overview of Challenges and Issues. Sensors 2022 , 22 , 4730. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Singh, A.; Thakur, N.; Sharma, A. A review of supervised machine learning algorithms. In Proceedings of the 2016 3rd International Conference on Computing for Sustainable Global Development (INDIACom), New Delhi, India, 16–18 March 2016; pp. 1310–1315. [ Google Scholar ]
  • Abdallah, E.E.; Eleisah, W.; Otoom, A.F. Intrusion Detection Systems using Supervised Machine Learning Techniques: A survey. Procedia Comput. Sci. 2022 , 201 , 205–212. [ Google Scholar ] [ CrossRef ]
  • Dike, H.U.; Zhou, Y.; Deveerasetty, K.K.; Wu, Q. Unsupervised Learning Based On Artificial Neural Network: A Review. In Proceedings of the 2018 IEEE International Conference on Cyborg and Bionic Systems (CBS), 25–27 October 2018; pp. 322–327. [ Google Scholar ]
  • van Engelen, J.E.; Hoos, H.H. A survey on semi-supervised learning. Mach. Learn. 2020 , 109 , 373–440. [ Google Scholar ] [ CrossRef ]
  • Rothmann, M.; Porrmann, M. A Survey of Domain-Specific Architectures for Reinforcement Learning. IEEE Access 2022 , 10 , 13753–13767. [ Google Scholar ] [ CrossRef ]
  • Dong, S.; Wang, P.; Abbas, K. A survey on deep learning and its applications. Comput. Sci. Rev. 2020 , 40 , 100379. [ Google Scholar ] [ CrossRef ]
  • Ray, S. A Quick Review of Machine Learning Algorithms. In Proceedings of the 2019 International Conference on Machine Learning, Big Data, Cloud and Parallel Computing (COMITCon), Faridabad, India, 14–16 February 2019; pp. 35–39. [ Google Scholar ]
  • Lansky, J.; Ali, S.; Mohammadi, M.; Majeed, M.K.; Karim, S.H.T.; Rashidi, S.; Hosseinzadeh, M.; Rahmani, A.M. Deep Learning-Based Intrusion Detection Systems: A Systematic Review. IEEE Access 2021 , 9 , 101574–101599. [ Google Scholar ] [ CrossRef ]
  • Massaoudi, M.; Abu-Rub, H.; Refaat, S.S.; Chihi, I.; Oueslati, F.S. Deep Learning in Smart Grid Technology: A Review of Recent Advancements and Future Prospects. IEEE Access 2021 , 9 , 54558–54578. [ Google Scholar ] [ CrossRef ]
  • Liu, H.; Lang, B. Machine Learning and Deep Learning Methods for Intrusion Detection Systems: A Survey. Appl. Sci. 2019 , 9 , 4396. [ Google Scholar ] [ CrossRef ]
  • Wu, N.; Xie, Y. A survey of machine learning for computer architecture and systems. ACM Comput. Surv. 2022 , 55 , 1–39. [ Google Scholar ] [ CrossRef ]
  • Schmarje, L.; Santarossa, M.; Schröder, S.-M.; Koch, R. A Survey on Semi-, Self- and Unsupervised Learning for Image Classification. IEEE Access 2021 , 9 , 82146–82168. [ Google Scholar ] [ CrossRef ]
  • Xie, J.; Yu, F.R.; Huang, T.; Xie, R.; Liu, J.; Wang, C.; Liu, Y. A Survey of Machine Learning Techniques Applied to Software Defined Networking (SDN): Research Issues and Challenges. In IEEE Communications Surveys & Tutorials ; IEEE: New York, NY, USA, 2019; Volume 21, pp. 393–430. [ Google Scholar ]
  • Yao, Z.; Lum, Y.; Johnston, A.; Mejia-Mendoza, L.M.; Zhou, X.; Wen, Y.; Aspuru-Guzik, A.; Sargent, E.H.; Seh, Z.W. Machine learning for a sustainable energy future. Nat. Rev. Mater. 2023 , 8 , 202–215. [ Google Scholar ] [ CrossRef ]
  • Al-Garadi, M.A.; Mohamed, A.; Al-Ali, A.K.; Du, X.; Ali, I.; Guizani, M. A Survey of Machine and Deep Learning Methods for Internet of Things (IoT) Security. In IEEE Communications Surveys & Tutorials ; IEEE: New York, NY, USA, 2020; Volume 22, pp. 1646–1685. [ Google Scholar ]
  • Messaoud, S.; Bradai, A.; Bukhari, S.H.R.; Quang, P.T.A.; Ahmed, O.B.; Atri, M. A survey on machine learning in internet of things: Algorithms, strategies, and applications. Internet Things 2020 , 12 , 100314. [ Google Scholar ] [ CrossRef ]
  • Umer, M.A.; Junejo, K.N.; Jilani, M.T.; Mathur, A.P. Machine learning for intrusion detection in industrial control systems: Applications, challenges, and recommendations. Int. J. Crit. Infrastruct. Prot. 2022 , 38 , 100516. [ Google Scholar ] [ CrossRef ]
  • Von Rueden, L.; Mayer, S.; Garcke, J.; Bauckhage, C.; Schuecker, J. Informed machine learning–towards a taxonomy of explicit integration of knowledge into machine learning. Learning 2019 , 18 , 19–20. [ Google Scholar ]
  • Waring, J.; Lindvall, C.; Umeton, R. Automated machine learning: Review of the state-of-the-art and opportunities for healthcare. Artif. Intell. Med. 2020 , 104 , 101822. [ Google Scholar ] [ CrossRef ]
  • Wang, H.; Lv, L.; Li, X.; Li, H.; Leng, J.; Zhang, Y.; Thomson, V.; Liu, G.; Wen, X.; Luo, G. A safety management approach for Industry 5.0's human-centered manufacturing based on digital twin. J. Manuf. Syst. 2023 , 66 , 1–12. [ Google Scholar ] [ CrossRef ]
  • Reuther, A.; Michaleas, P.; Jones, M.; Gadepally, V.; Samsi, S.; Kepner, J. Survey and Benchmarking of Machine Learning Accelerators. In Proceedings of the 2019 IEEE High Performance Extreme Computing Conference (HPEC), Waltham, MA USA, 24–26 September 2019; pp. 1–9. [ Google Scholar ]
  • Kaur, B.; Dadkhah, S.; Shoeleh, F.; Neto, E.C.P.; Xiong, P.; Iqbal, S.; Lamontagne, P.; Ray, S.; Ghorbani, A.A. Internet of Things (IoT) security dataset evolution: Challenges and future directions. Internet Things 2023 , 22 , 100780. [ Google Scholar ] [ CrossRef ]
  • Paullada, A.; Raji, I.D.; Bender, E.M.; Denton, E.; Hanna, A. Data and its (dis)contents: A survey of dataset development and use in machine learning research. Patterns 2021 , 2 , 100336. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Slimane, H.O.; Benouadah, S.; Khoei, T.T.; Kaabouch, N. A Light Boosting-based ML Model for Detecting Deceptive Jamming Attacks on UAVs. In Proceedings of the 2022 IEEE 12th Annual Computing and Communication Workshop and Conference (CCWC), Las Vegas, NV, USA, 26–29 January 2022; pp. 328–333. [ Google Scholar ]
  • Manesh, M.R.; Kenney, J.; Hu, W.C.; Devabhaktuni, V.K.; Kaabouch, N. Detection of GPS spoofing attacks on unmanned aerial systems. In Proceedings of the 16th IEEE Annual Consumer Communications & Networking Conference (CCNC), Las Vegas, NV, USA, 11–14 January 2019; pp. 1–6. [ Google Scholar ]
  • Sharifani, K.; Amini, M. Machine Learning and Deep Learning: A Review of Methods and Applications. World Inf. Technol. Eng. J. 2023 , 10 , 3897–3904. [ Google Scholar ]
  • Obaid, H.S.; Dheyab, S.A.; Sabry, S.S. The Impact of Data Pre-Processing Techniques and Dimensionality Reduction on the Accuracy of Machine Learning. In Proceedings of the 2019 9th Annual Information Technology, Electromechanical Engineering and Microelectronics Conference (IEMECON), Jaipur, India, 13–15 March 2019; pp. 279–283. [ Google Scholar ]
  • Liu, B.; Ding, M.; Shaham, S.; Rahayu, W.; Lin, Z. When machine learning meets privacy: A survey and outlook. ACM Comput. Surv. (CSUR) 2021 , 54 , 1–36. [ Google Scholar ] [ CrossRef ]
  • Singh, S.; Gupta, P. Comparative study ID3, cart and C4. 5 decision tree algorithm: A survey. Int. J. Adv. Inf. Sci. Technol. (IJAIST) 2014 , 27 , 97–103. [ Google Scholar ]
  • Zhang, M.-L.; Zhou, Z.-H. ML-KNN: A lazy learning approach to multi-label learning. Pattern Recognit. 2007 , 40 , 2038–2048. [ Google Scholar ] [ CrossRef ]
  • Musavi, M.T.; Ahmed, W.; Chan, K.H.; Faris, K.B.; Hummels, D.M. On the training of radial basis function classifiers. Neural Netw. 1992 , 5 , 595–603. [ Google Scholar ] [ CrossRef ]
  • Zhou, J.; Gandomi, A.H.; Chen, F.; Holzinger, A. Evaluating the Quality of Machine Learning Explanations: A Survey on Methods and Metrics. Electronics 2021 , 10 , 593. [ Google Scholar ] [ CrossRef ]
  • Jiang, T.; Fang, H.; Wang, H. Blockchain-Based Internet of Vehicles: Distributed Network Architecture and Performance Analysis. IEEE Internet Things J. 2019 , 6 , 4640–4649. [ Google Scholar ] [ CrossRef ]
  • Jia, W.; Dai, D.; Xiao, X.; Wu, H. ARNOR: Attention regularization based noise reduction for distant supervision relation classification. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, 28 July–2 August 2019; pp. 1399–1408. [ Google Scholar ]
  • Abiodun, O.I.; Omolara, A.E.; Dada, K.V.; Mohamed, N.A.; Arshad, H. State-of-the-art in artificial neural network applications: A survey. Heliyon 2018 , 4 , e00938. [ Google Scholar ] [ CrossRef ]
  • Izeboudjen, N.; Larbes, C.; Farah, A. A new classification approach for neural networks hardware: From standards chips to embedded systems on chip. Artif. Intell. Rev. 2014 , 41 , 491–534. [ Google Scholar ] [ CrossRef ]
  • Wang, D.; He, H.; Liu, D. Intelligent Optimal Control With Critic Learning for a Nonlinear Overhead Crane System. IEEE Trans. Ind. Informatics 2018 , 14 , 2932–2940. [ Google Scholar ] [ CrossRef ]
  • Wang, S.-C. Artificial Neural Network. In Interdisciplinary Computing in Java Programming ; Springer: Berlin/Heidelberg, Germany, 2003; pp. 81–100. [ Google Scholar ]
  • Albawi, S.; Mohammed, T.A.; Al-Zawi, S. Understanding of a convolutional neural network. In Proceedings of the International Conference on Engineering and Technology (ICET), Antalya, Turkey, 21–23 August 2017. [ Google Scholar ]
  • Khoei, T.T.; Slimane, H.O.; Kaabouch, N. Cyber-Security of Smart Grids: Attacks, Detection, Countermeasure Techniques, and Future Directions. Commun. Netw. 2022 , 14 , 119–170. [ Google Scholar ] [ CrossRef ]
  • Gunturi, S.K.; Sarkar, D. Ensemble machine learning models for the detection of energy theft. Electr. Power Syst. Res. 2021 , 192 , 106904. [ Google Scholar ] [ CrossRef ]
  • Chafii, M.; Bader, F.; Palicot, J. Enhancing coverage in narrow band-IoT using machine learning. In Proceedings of the 2018 IEEE Wireless Communications and Networking Conference (WCNC), Barcelona, Spain, 15–18 April 2018; pp. 1–6. [ Google Scholar ]
  • Bithas, P.S.; Michailidis, E.T.; Nomikos, N.; Vouyioukas, D.; Kanatas, A.G. A Survey on Machine-Learning Techniques for UAV-Based Communications. Sensors 2019 , 19 , 5170. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Benos, L.; Tagarakis, A.C.; Dolias, G.; Berruto, R.; Kateris, D.; Bochtis, D. Machine Learning in Agriculture: A Comprehensive Updated Review. Sensors 2021 , 21 , 3758. [ Google Scholar ] [ CrossRef ]
  • Wagle, P.P.; Rani, S.; Kowligi, S.B.; Suman, B.H.; Pramodh, B.; Kumar, P.; Raghavan, S.; Shastry, K.A.; Sanjay, H.A.; Kumar, M.; et al. Machine Learning-Based Ensemble Network Security System. In Recent Advances in Artificial Intelligence and Data Engineering ; Springer: Berlin/Heidelberg, Germany, 2022; pp. 3–15. [ Google Scholar ]
  • Sutton, C.D. Classification and regression trees, bagging, and boosting. Handb. Stat. 2005 , 24 , 303–329. [ Google Scholar ]
  • Zaadnoordijk, L.; Besold, T.R.T.; Cusack, R. Lessons from infant learning for unsupervised machine learning. Nat. Mach. Intell. 2022 , 4 , 510–520. [ Google Scholar ] [ CrossRef ]
  • Khoei, T.T.; Kaabouch, N. A Comparative Analysis of Supervised and Unsupervised Models for Detecting Attacks on the Intrusion Detection Systems. Information 2023 , 14 , 103. [ Google Scholar ] [ CrossRef ]
  • Kumar, P.; Gupta, G.P.; Tripathi, R. An ensemble learning and fog-cloud architecture-driven cyber-attack detection framework for IoMT networks. Comput. Commun. 2021 , 166 , 110–124. [ Google Scholar ] [ CrossRef ]
  • Hady, M.; Abdel, A.M.F.; Schwenker, F. Semi-supervised learning. In Handbook on Neural Information Processing ; Springer: Berlin/Heidelberg, Germany, 2013. [ Google Scholar ]
  • Elsken, T.; Metzen, J.H.; Hutter, F. Neural architecture search: A survey. J. Mach. Learn. Res. 2019 , 20 , 1–21. [ Google Scholar ]
  • Luo, Y.; Zhu, J.; Li, M.; Ren, Y.; Zhang, B. Smooth neighbors on teacher graphs for semi-supervised learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Lake City, UT, USA, 18–22 June 2018; pp. 8896–8905. [ Google Scholar ]
  • Park, S.; Park, J.; Shin, S.; Moon, I. Adversarial dropout for supervised and semi-supervised learning. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018; pp. 3917–3924. [ Google Scholar ]
  • Khoei, T.T.; Kaabouch, N. A Capsule Q-learning based reinforcement model for intrusion detection system on smart grid. In Proceedings of the IEEE International Conference on Electro Information Technology (eIT), Romeoville, IL, USA, 18–20 May 2023; pp. 333–339. [ Google Scholar ]
  • Polydoros, A.S.; Nalpantidis, L. Survey of model-based reinforcement learning: Applications on robotics. J. Intell. Robot. Syst. 2017 , 86 , 153–173. [ Google Scholar ] [ CrossRef ]
  • Degris, T.; Pilarski, P.M.; Sutton, R.S. Model-Free reinforcement learning with continuous action in practice. In Proceedings of the 2012 American Control Conference (ACC), Montreal, QC, Canada, 27–29 June 2012; pp. 2177–2182. [ Google Scholar ] [ CrossRef ]
  • Cao, D.; Hu, W.; Zhao, J.; Zhang, G.; Zhang, B.; Liu, Z.; Chen, Z.; Blaabjerg, F. Reinforcement learning and its applications in modern power and energy systems: A review. J. Mod. Power Syst. Clean Energy 2020 , 8 , 1029–1042. [ Google Scholar ] [ CrossRef ]
  • Zhang, J.M.; Harman, M.; Ma, L.; Liu, Y. Machine Learning Testing: Survey, Landscapes and Horizons. In IEEE Transactions on Software Engineering ; IEEE: New York, NY, USA, 2022; Volume 48, pp. 1–36. [ Google Scholar ]
  • Salahdine, F.; Kaabouch, N. Security threats, detection, and countermeasures for physical layer in cognitive radio networks: A survey. Phys. Commun. 2020 , 39 , 101001. [ Google Scholar ] [ CrossRef ]
  • Ramírez, J.; Yu, W.; Perrusquía, A. Model-free reinforcement learning from expert demonstrations: A survey. Artif. Intell. Rev. 2022 , 55 , 3213–3241. [ Google Scholar ] [ CrossRef ]
  • Yang, L.; Shami, A. On hyperparameter optimization of machine learning algorithms: Theory and practice. Neurocomputing 2020 , 415 , 295–316. [ Google Scholar ] [ CrossRef ]
  • Dev, K.; Maddikunta, P.K.R.; Gadekallu, T.R.; Bhattacharya, S.; Hegde, P.; Singh, S. Energy Optimization for Green Communication in IoT Using Harris Hawks Optimization. In IEEE Transactions on Green Communications and Networking ; IEEE: New York, NY, USA, 2022; Volume 6, pp. 685–694. [ Google Scholar ]
  • Khodadadi, N.; Snasel, V.; Mirjalili, S. Dynamic Arithmetic Optimization Algorithm for Truss Optimization Under Natural Fre-quency Constraints. IEEE Access 2022 , 10 , 16188–16208. [ Google Scholar ] [ CrossRef ]
  • Cummins, C.; Wasti, B.; Guo, J.; Cui, B.; Ansel, J.; Gomez, S.; Jain, S.; Liu, J.; Teytaud, O.; Steiner, B.; et al. CompilerGym: Robust, Performant Compiler Optimization Environments for AI Research. In Proceedings of the 2022 IEEE/ACM International Symposium on Code Generation and Optimization (CGO), Seoul, Republic of Korea, 2–6 April 2022; pp. 92–105. [ Google Scholar ]
  • Zhang, W.; Gu, X.; Tang, L.; Yin, Y.; Liu, D.; Zhang, Y. Application of machine learning, deep learning and optimization algorithms in geoengineering and geoscience: Comprehensive review and future challenge. Gondwana Res. 2022 , 109 , 1–17. [ Google Scholar ] [ CrossRef ]
  • Mittal, S.; Vaishay, S. A survey of techniques for optimizing deep learning on GPUs. J. Syst. Arch. 2019 , 99 , 101635. [ Google Scholar ] [ CrossRef ]
  • Zhang, Q.; Yang, L.T.; Chen, Z.; Li, P. A survey on deep learning for big data. Inf. Fusion 2018 , 42 , 146–157. [ Google Scholar ] [ CrossRef ]
  • Oyelade, O.N.; Ezugwu, A.E.-S.; Mohamed, T.I.A.; Abualigah, L. Ebola Optimization Search Algorithm: A New Nature-Inspired Metaheuristic Optimization Algorithm. IEEE Access 2022 , 10 , 16150–16177. [ Google Scholar ] [ CrossRef ]
  • Blank, J.; Deb, K. Pymoo: Multi-Objective Optimization in Python. IEEE Access 2020 , 8 , 89497–89509. [ Google Scholar ] [ CrossRef ]
  • Qiao, K.; Yu, K.; Qu, B.; Liang, J.; Song, H.; Yue, C. An Evolutionary Multitasking Optimization Framework for Constrained Multi-objective Optimization Problems. IEEE Trans. Evol. Comput. 2022 , 26 , 263–277. [ Google Scholar ] [ CrossRef ]
  • Riaz, M.; Ahmad, S.; Hussain, I.; Naeem, M.; Mihet-Popa, L. Probabilistic Optimization Techniques in Smart Power System. Energies 2022 , 15 , 825. [ Google Scholar ] [ CrossRef ]
  • Yu, T.; Zhu, H. Hyper-parameter optimization: A review of algorithms and applications. arXiv 2020 , arXiv:2003.05689. [ Google Scholar ]
  • Yang, X.; Song, Z.; King, I.; Xu, Z. A Survey on deep semi-supervised learning. arXiv 2021 , arXiv:2103.00550. [ Google Scholar ] [ CrossRef ]
  • Gibson, B.R.; Rogers, T.T.; Zhu, X. Human semi-supervised learning. Top. Cogn. Sci. 2013 , 5 , 132–172. [ Google Scholar ] [ CrossRef ]
  • Nguyen, T.T.; Nguyen, N.D.; Nahavandi, S. Deep Reinforcement Learning for Multiagent Systems: A Review of Challenges, Solutions, and Applications. IEEE Trans. Cybern. 2020 , 50 , 3826–3839. [ Google Scholar ] [ CrossRef ]
  • Canese, L.; Cardarilli, G.C.; Di Nunzio, L.; Fazzolari, R.; Giardino, D.; Re, M.; Spanò, S. Multi-Agent Reinforcement Learning: A Review of Challenges and Applications. Appl. Sci. 2021 , 11 , 4948. [ Google Scholar ] [ CrossRef ]
  • Du, W.; Ding, S. A survey on multi-agent deep reinforcement learning: From the perspective of challenges and applications. Artif. Intell. Rev. 2020 , 54 , 3215–3238. [ Google Scholar ] [ CrossRef ]
  • Salwan, D.; Kant, S.; Pareek, H.; Sharma, R. Challenges with reinforcement learning in prosthesis. Mater. Today Proc. 2022 , 49 , 3133–3136. [ Google Scholar ] [ CrossRef ]
  • Narkhede, M.S.; Chatterji, S.; Ghosh, S. Trends and challenges in optimization techniques for operation and control of Microgrid—A review. In Proceedings of the 2012 1st International Conference on Power and Energy in NERIST (ICPEN), Nirjuli, India, 28–29 December 2012; pp. 1–7. [ Google Scholar ]
  • Khoei, T.T.; Ismail, S.; Kaabouch, N. Dynamic Selection Techniques for Detecting GPS Spoofing Attacks on UAVs. Sensors 2022 , 22 , 662. [ Google Scholar ] [ CrossRef ]
  • Khoei, T.T.; Ismail, S.; Al Shamaileh, K.; Devabhaktuni, V.K.; Kaabouch, N. Impact of Dataset and Model Parameters on Machine Learning Performance for the Detection of GPS Spoofing Attacks on Unmanned Aerial Vehicles. Appl. Sci. 2022 , 13 , 383. [ Google Scholar ] [ CrossRef ]
  • Khoei, T.T.; Kaabouch, N. Densely Connected Neural Networks for Detecting Denial of Service Attacks on Smart Grid Network. In Proceedings of the IEEE 13th Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON), New York, NY, USA, 26–29 October 2022; pp. 0207–0211. [ Google Scholar ]
  • Khan, A.; Khan, S.H.; Saif, M.; Batool, A.; Sohail, A.; Khan, M.W. A Survey of Deep Learning Techniques for the Analysis of COVID-19 and their usability for Detecting Omicron. J. Exp. Theor. Artif. Intell. 2023 , 1–43. [ Google Scholar ] [ CrossRef ]
  • Gopinath, M.; Sethuraman, S.C. A comprehensive survey on deep learning based malware detection techniques. Comput. Sci. Rev. 2023 , 47 , 100529. [ Google Scholar ]
  • Gheisari, M.; Ebrahimzadeh, F.; Rahimi, M.; Moazzamigodarzi, M.; Liu, Y.; Pramanik, P.K.D.; Heravi, M.A.; Mehbodniya, A.; Ghaderzadeh, M.; Feylizadeh, M.R.; et al. Deep learning: Applications, architectures, models, tools, and frameworks: A comprehensive survey. In CAAI Transactions on Intelligence Technology ; IET: Stevenage, UK, 2023. [ Google Scholar ]
  • Morgan, D.; Jacobs, R. Opportunities and challenges for machine learning in materials science. Annu. Rev. Mater. Res. 2020 , 50 , 71–103. [ Google Scholar ] [ CrossRef ]
  • Phoon, K.K.; Zhang, W. Future of machine learning in geotechnics. Georisk Assess. Manag. Risk Eng. Syst. Geohazards 2023 , 17 , 7–22. [ Google Scholar ] [ CrossRef ]
  • Krishnam, N.P.; Ashraf, M.S.; Rajagopal, B.R.; Vats, P.; Chakravarthy, D.S.K.; Rafi, S.M. Analysis of Current Trends, Advances and Challenges of Machine Learning (Ml) and Knowledge Extraction: From Ml to Explainable AI. Ind. Qualif.-Stitute Adm. Manag. UK 2022 , 58 , 54–62. [ Google Scholar ]
  • Li, Z.; Yoon, J.; Zhang, R.; Rajabipour, F.; Srubar, W.V., III; Dabo, I.; Radlińska, A. Machine learning in concrete science: Applications, challenges, and best practices. NPJ Comput. Mater. 2022 , 8 , 127. [ Google Scholar ] [ CrossRef ]
  • Houssein, E.H.; Abohashima, Z.; Elhoseny, M.; Mohamed, W.M. Machine learning in the quantum realm: The state-of-the-art, challenges, and future vision. Expert Syst. Appl. 2022 , 194 , 116512. [ Google Scholar ] [ CrossRef ]
  • Khan, T.; Tian, W.; Zhou, G.; Ilager, S.; Gong, M.; Buyya, R. Machine learning (ML)-centric resource management in cloud computing: A review and future directions. J. Netw. Comput. Appl. 2022 , 204 , 103405. [ Google Scholar ] [ CrossRef ]
  • Esterhuizen, J.A.; Goldsmith, B.R.; Linic, S. Interpretable machine learning for knowledge generation in heterogeneous catalysis. Nat. Catal. 2022 , 5 , 175–184. [ Google Scholar ] [ CrossRef ]
  • Bharadiya, J.P. Leveraging Machine Learning for Enhanced Business Intelligence. Int. J. Comput. Sci. Technol. 2023 , 7 , 1–19. [ Google Scholar ]
  • Talaei Khoei, T.; Ould Slimane, H.; Kaabouch, N. Deep learning: Systematic review, models, challenges, and research directions. In Neural Computing and Applications ; Springer: Berlin/Heidelberg, Germany, 2023; pp. 1–22. [ Google Scholar ]
  • Ben Amor, S.; Belaid, F.; Benkraiem, R.; Ramdani, B.; Guesmi, K. Multi-criteria classification, sorting, and clustering: A bibliometric review and research agenda. Ann. Oper. Res. 2023 , 325 , 771–793. [ Google Scholar ] [ CrossRef ]
  • Valdez, F.; Melin, P. A review on quantum computing and deep learning algorithms and their applications. Soft Comput. 2023 , 27 , 13217–13236. [ Google Scholar ] [ CrossRef ]
  • Fihri, W.F.; Arjoune, Y.; Hassan El Ghazi, H.; Kaabouch, N.; Abou El Majd, A.B. A particle swarm optimization based algorithm for primary user emulation attack detection. In Proceedings of the 2018 IEEE 8th Annual Computing and Communication Workshop and Conference (CCWC), Las Vegas, NV, USA, 8–10 January 2018; pp. 823–827. [ Google Scholar ]


Classification categories of supervised learning models (the original table also lists the characteristics, advantages, and disadvantages of each category); see the sketch after this list:

  • Bayesian-based
  • Tree-based
  • Instance-based
  • Regularization-based
  • Neural network-based
  • Ensemble-based
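To make these families concrete, the sketch below trains one representative classifier per category on the Iris dataset. The use of scikit-learn, the specific models, and the dataset are illustrative assumptions added here, not choices made in the original article.

```python
# Hedged sketch: one representative scikit-learn classifier per family above.
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB              # Bayesian-based
from sklearn.tree import DecisionTreeClassifier         # tree-based
from sklearn.neighbors import KNeighborsClassifier      # instance-based
from sklearn.linear_model import LogisticRegression     # regularization-based (L2)
from sklearn.neural_network import MLPClassifier        # neural network-based
from sklearn.ensemble import RandomForestClassifier     # ensemble-based

X, y = load_iris(return_X_y=True)
models = {
    "Bayesian": GaussianNB(),
    "Tree": DecisionTreeClassifier(max_depth=3),
    "Instance": KNeighborsClassifier(n_neighbors=5),
    "Regularization": LogisticRegression(penalty="l2", max_iter=1000),
    "Neural network": MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000),
    "Ensemble": RandomForestClassifier(n_estimators=100),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)  # 5-fold cross-validated accuracy
    print(f"{name:15s} mean accuracy = {scores.mean():.3f}")
```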
Classification categories of semi-supervised learning models (a sketch follows this list):

  • Inductive-based. Characteristics: generates a model that can make predictions for any sample in the input space. Advantage: the predictions of new samples are independent of old samples. Disadvantage: the same model is used both in training and in predicting new data samples.
  • Transductive-based. Characteristics: predictive strength is limited to the objects processed during the training steps. Advantage: no difference between the training and testing steps. Disadvantage: no distinction between the transductive algorithms in a supervised manner.
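A minimal sketch of the inductive/transductive split, assuming scikit-learn (our choice for illustration): self-training (inductive) wraps a base classifier and returns a model reusable on unseen inputs, while label propagation (transductive) only assigns labels to the samples it was given.

```python
# Hedged sketch: inductive vs. transductive semi-supervised learning.
# Unlabeled samples are marked with the label -1, per scikit-learn convention.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.semi_supervised import LabelPropagation, SelfTrainingClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
rng = np.random.RandomState(0)
y_partial = y.copy()
y_partial[rng.rand(len(y)) < 0.7] = -1   # hide ~70% of the labels

# Inductive: self-training yields a model that can score arbitrary new inputs.
inductive = SelfTrainingClassifier(SVC(probability=True)).fit(X, y_partial)
print("inductive accuracy on all samples:", inductive.score(X, y))

# Transductive: label propagation spreads labels over the given samples only.
transductive = LabelPropagation().fit(X, y_partial)
print("first ten transduced labels:", transductive.transduction_[:10])
```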
Classification categories of unsupervised learning models (a sketch follows this list):

  • Cluster-based: divides uncategorized data into similar groups.
  • Dimensionality reduction-based: decreases the number of features in the given dataset.
  • Neural network-based: inspired by the human brain.
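A short sketch of the first two categories, again assuming scikit-learn: k-means for cluster-based learning and principal component analysis for dimensionality reduction, applied to the same unlabeled data.

```python
# Hedged sketch: cluster-based and dimensionality-reduction-based unsupervised
# learning on the same unlabeled data.
from sklearn.datasets import load_iris
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)    # labels discarded: unsupervised setting

# Cluster-based: group the samples into 3 clusters.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
print("cluster sizes:", [int((labels == k).sum()) for k in range(3)])

# Dimensionality-reduction-based: project 4 features down to 2 components.
X2 = PCA(n_components=2).fit_transform(X)
print("reduced shape:", X2.shape)
```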
Classification categories of reinforcement learning models (a sketch follows this list):

  • Model-based: optimal actions are learned via a model of the environment.
  • Model-free: learns without a transition probability distribution or reward function for the Markov decision process.
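The sketch below illustrates the model-free case with tabular Q-learning on a toy five-state chain; the environment, hyperparameters, and episode budget are all illustrative assumptions we add here, not details from the article.

```python
# Hedged sketch: model-free tabular Q-learning on a toy 5-state chain MDP.
import numpy as np

n_states, n_actions = 5, 2          # actions: 0 = left, 1 = right
q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.1, 0.95, 0.1  # learning rate, discount, exploration
rng = np.random.default_rng(0)

def step(state, action):
    """Move left (0) or right (1); reward 1 for reaching the rightmost state."""
    nxt = min(state + 1, n_states - 1) if action == 1 else max(state - 1, 0)
    done = nxt == n_states - 1
    return nxt, float(done), done

for _ in range(500):                 # episodes
    s, done = 0, False
    while not done:
        a = int(rng.integers(n_actions)) if rng.random() < eps else int(np.argmax(q[s]))
        s2, r, done = step(s, a)
        # Q-learning update: bootstrap from the best next-state value, never
        # modeling the transition probabilities or reward function explicitly.
        q[s, a] += alpha * (r + gamma * np.max(q[s2]) - q[s, a])
        s = s2

print("greedy policy per non-terminal state (1 = right):", np.argmax(q[:-1], axis=1))
```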
Data preprocessing steps, methodologies, techniques, and highlights (a pipeline sketch follows this list):

  • Data transformation. Methodology: standardization and normalization. Techniques: standard scaler, min-max scaling, max-abs scaler, robust scaler, quantile transformer scaler, power transformer scaler, unit vector normalization. Highlights: extract the given data and convert them to a usable format.
  • Data cleaning. Methodology: missing value imputation. Techniques: complete case analysis, frequent category imputation, mean/median imputation, mode imputation, end-of-tail imputation, nearest neighbor imputation, iterative imputation, hot and cold deck imputation, exploration imputation, interpolation imputation, regression-based imputation. Highlights: loss of efficiency, strong bias, and complications in handling data. Methodology: noise treatment. Techniques: data polishing, noise filters.
  • Data reduction/increase. Methodology: feature selection. Techniques: wrapper, filter, embedded. Methodology: feature extraction. Techniques: principal component analysis, linear discriminant analysis, independent component analysis, partial least squares, multifactor dimensionality reduction, nonlinear dimensionality reduction, autoencoder, tensor decomposition. Methodology: instance generation. Techniques: condensation algorithms, edition algorithms, hybrid algorithms. Highlights: decrease or increase the number of samples, or drop features that are not important in the process of training.
  • Discretization. Techniques: chi-squared discretization, efficient discretization. Highlights: loss of information, simplicity, readability, and a faster learning process.
  • Imbalanced learning. Methodology: under-sampling. Techniques: random under-sampling, Tomek links, condensed nearest neighbor, edited nearest neighbor, near-miss under-sampling. Methodology: oversampling. Techniques: random oversampling, synthetic minority oversampling technique (SMOTE), adaptive synthetic sampling (ADASYN), borderline-SMOTE. Highlights: presents true evaluation results.
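A minimal sketch chaining three of the steps above (missing value imputation, standardization, and feature extraction) into one pipeline, assuming scikit-learn; oversampling techniques such as SMOTE live in the separate imbalanced-learn package and are omitted here.

```python
# Hedged sketch: imputation -> standardization -> PCA as a single pipeline.
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

X = np.array([[1.0, 200.0, 3.0],
              [2.0, np.nan, 1.0],      # missing value to impute
              [3.0, 180.0, np.nan],
              [4.0, 220.0, 2.0]])

pipe = Pipeline([
    ("impute", SimpleImputer(strategy="mean")),   # data cleaning
    ("scale", StandardScaler()),                  # data transformation
    ("extract", PCA(n_components=2)),             # data reduction
])
print(pipe.fit_transform(X))
```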
Hyperparameter tuning methods (the original table also lists the strengths and limitations of each; a sketch follows this list):

  • Grid search
  • Random search
  • Genetic algorithms
  • Gradient-based techniques
  • Bayesian optimization with Gaussian processes
  • Particle swarm optimization
  • Bayesian optimization with a tree-structured Parzen estimator
  • Hyperband
  • Bayesian optimization with SMAC
  • Population-based methods
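The first two methods in this list are the simplest to demonstrate. The sketch below tunes an SVM first with exhaustive grid search and then with random search over continuous distributions, assuming scikit-learn and SciPy; the search spaces and budgets are illustrative.

```python
# Hedged sketch: grid search vs. random search for SVM hyperparameters.
from scipy.stats import uniform
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Grid search: exhaustively evaluates every combination on the grid.
grid = GridSearchCV(SVC(), {"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]}, cv=5)
grid.fit(X, y)
print("grid search best:", grid.best_params_, round(grid.best_score_, 3))

# Random search: samples a fixed budget of configurations from distributions.
rand = RandomizedSearchCV(SVC(), {"C": uniform(0.1, 10), "gamma": uniform(0.01, 1)},
                          n_iter=20, cv=5, random_state=0)
rand.fit(X, y)
print("random search best:", rand.best_params_, round(rand.best_score_, 3))
```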
Evaluation metric categories (the metric names per category appear in the original table; a sketch follows this list):

  • Supervised learning
  • Unsupervised learning
  • Semi-supervised learning
  • Reinforcement learning
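As an example for the supervised category, the sketch below computes four widely used classification metrics with scikit-learn; this particular metric set is our illustration, not a reproduction of the original table.

```python
# Hedged sketch: common supervised evaluation metrics on toy predictions.
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))
print("recall   :", recall_score(y_true, y_pred))
print("f1       :", f1_score(y_true, y_pred))
```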
Challenges and future directions (the original table pairs each challenge with a description):

  • Interpretability and explainability
  • Bias and fairness
  • Adversarial robustness
  • Privacy and security
  • Reinforcement learning
  • Quantum computing
  • Multi-criteria models

Share and Cite

Talaei Khoei, T.; Kaabouch, N. Machine Learning: Models, Challenges, and Research Directions. Future Internet 2023, 15, 332. https://doi.org/10.3390/fi15100332



Machine Learning: Algorithms, Real-World Applications and Research Directions

Iqbal H. Sarker

1 Swinburne University of Technology, Melbourne, VIC 3122 Australia

2 Department of Computer Science and Engineering, Chittagong University of Engineering & Technology, 4349 Chattogram, Bangladesh

In the current age of the Fourth Industrial Revolution (4IR or Industry 4.0), the digital world holds a wealth of data, such as Internet of Things (IoT) data, cybersecurity data, mobile data, business data, social media data, health data, etc. To intelligently analyze these data and develop the corresponding smart and automated applications, knowledge of artificial intelligence (AI), particularly machine learning (ML), is the key. Various types of machine learning algorithms, such as supervised, unsupervised, semi-supervised, and reinforcement learning, exist in the area. In addition, deep learning, which is part of a broader family of machine learning methods, can intelligently analyze data on a large scale. In this paper, we present a comprehensive view of these machine learning algorithms that can be applied to enhance the intelligence and capabilities of an application. Thus, this study’s key contribution is explaining the principles of different machine learning techniques and their applicability in various real-world application domains, such as cybersecurity systems, smart cities, healthcare, e-commerce, agriculture, and many more. We also highlight the challenges and potential research directions based on our study. Overall, this paper aims to serve as a reference point for academia and industry professionals as well as for decision-makers in various real-world situations and application areas, particularly from the technical point of view.

Introduction

We live in the age of data, where everything around us is connected to a data source, and everything in our lives is digitally recorded [ 21 , 103 ]. For instance, the current electronic world has a wealth of various kinds of data, such as the Internet of Things (IoT) data, cybersecurity data, smart city data, business data, smartphone data, social media data, health data, COVID-19 data, and many more. These data can be structured, semi-structured, or unstructured, discussed briefly in Sect. “ Types of Real-World Data and Machine Learning Techniques ”, and are increasing day by day. Extracting insights from these data can be used to build various intelligent applications in the relevant domains. For instance, to build a data-driven automated and intelligent cybersecurity system, the relevant cybersecurity data can be used [ 105 ]; to build personalized context-aware smart mobile applications, the relevant mobile data can be used [ 103 ], and so on. Thus, data management tools and techniques that are capable of extracting insights or useful knowledge from data in a timely and intelligent way are urgently needed, as real-world applications are built on them.

Artificial intelligence (AI), particularly machine learning (ML), has grown rapidly in recent years in the context of data analysis and computing, typically allowing applications to function in an intelligent manner [ 95 ]. ML usually provides systems with the ability to learn and improve from experience automatically without being specifically programmed and is generally regarded as one of the most popular technologies of the fourth industrial revolution (4IR or Industry 4.0) [ 103 , 105 ]. “Industry 4.0” [ 114 ] is typically the ongoing automation of conventional manufacturing and industrial practices, including exploratory data processing, using new smart technologies such as machine learning automation. Thus, to intelligently analyze these data and to develop the corresponding real-world applications, machine learning algorithms are the key. The learning algorithms can be categorized into four major types: supervised, unsupervised, semi-supervised, and reinforcement learning [ 75 ], discussed briefly in Sect. “ Types of Real-World Data and Machine Learning Techniques ”. The popularity of these approaches to learning is increasing day by day, as shown in Fig. 1, based on data collected from Google Trends [ 4 ] over the last five years. The x-axis of the figure indicates the specific dates, and the corresponding popularity score, within the range of 0 (minimum) to 100 (maximum), is shown on the y-axis. According to Fig. 1, the popularity indication values for these learning types were low in 2015 and have been increasing day by day. These statistics motivate us to study machine learning in this paper, which can play an important role in the real world through Industry 4.0 automation.

Fig. 1 The worldwide popularity score of various types of ML algorithms (supervised, unsupervised, semi-supervised, and reinforcement) in a range of 0 (min) to 100 (max) over time, where the x-axis represents the timestamp information and the y-axis represents the corresponding score

In general, the effectiveness and the efficiency of a machine learning solution depend on the nature and characteristics of the data and the performance of the learning algorithms . In the area of machine learning algorithms, classification analysis, regression, data clustering, feature engineering and dimensionality reduction, association rule learning, and reinforcement learning techniques exist to effectively build data-driven systems [ 41 , 125 ]. Besides, deep learning, which originated from the artificial neural network, can be used to intelligently analyze data and is known as part of a wider family of machine learning approaches [ 96 ]. Thus, selecting a learning algorithm that is suitable for the target application in a particular domain is challenging. The reason is that different learning algorithms serve different purposes, and even the outcomes of learning algorithms in the same category may vary depending on the data characteristics [ 106 ]. Thus, it is important to understand the principles of various machine learning algorithms and their applicability in various real-world application areas, such as IoT systems, cybersecurity services, business and recommendation systems, smart cities, healthcare and COVID-19, context-aware systems, sustainable agriculture, and many more, which are explained briefly in Sect. “ Applications of Machine Learning ”.

Based on the importance and potential of “Machine Learning” to analyze the data mentioned above, in this paper we provide a comprehensive view of various types of machine learning algorithms that can be applied to enhance the intelligence and capabilities of an application. Thus, the key contribution of this study is explaining the principles and potential of different machine learning techniques and their applicability in the various real-world application areas mentioned earlier. The purpose of this paper is, therefore, to provide a basic guide for academia and industry professionals who want to study, research, and develop data-driven automated and intelligent systems in the relevant areas based on machine learning techniques.

The key contributions of this paper are listed as follows:

  • To define the scope of our study by taking into account the nature and characteristics of various types of real-world data and the capabilities of various learning techniques.
  • To provide a comprehensive view on machine learning algorithms that can be applied to enhance the intelligence and capabilities of a data-driven application.
  • To discuss the applicability of machine learning-based solutions in various real-world application domains.
  • To highlight and summarize the potential research directions within the scope of our study for intelligent data analysis and services.

The rest of the paper is organized as follows. The next section presents the types of data and machine learning algorithms in a broader sense and defines the scope of our study. We briefly discuss and explain different machine learning algorithms in the subsequent section followed by which various real-world application areas based on machine learning algorithms are discussed and summarized. In the penultimate section, we highlight several research issues and potential future directions, and the final section concludes this paper.

Types of Real-World Data and Machine Learning Techniques

Machine learning algorithms typically consume and process data to learn the related patterns about individuals, business processes, transactions, events, and so on. In the following, we discuss various types of real-world data as well as categories of machine learning algorithms.

Types of Real-World Data

Usually, the availability of data is considered the key to constructing a machine learning model or data-driven real-world system [ 103 , 105 ]. Data can be of various forms, such as structured, semi-structured, or unstructured [ 41 , 72 ]. In addition, “metadata” is another type, which typically represents data about the data. In the following, we briefly discuss these types of data.

  • Structured: It has a well-defined structure and conforms to a data model following a standard order; it is highly organized and easily accessed and used by an entity or a computer program. Structured data are typically stored in well-defined schemas such as relational databases, i.e., in a tabular format. For instance, names, dates, addresses, credit card numbers, stock information, geolocation, etc. are examples of structured data.
  • Unstructured: On the other hand, there is no pre-defined format or organization for unstructured data, making it much more difficult to capture, process, and analyze; it mostly contains text and multimedia material. For example, sensor data, emails, blog entries, wikis, word processing documents, PDF files, audio files, videos, images, presentations, web pages, and many other types of business documents can be considered unstructured data.
  • Semi-structured: Semi-structured data are not stored in a relational database like the structured data mentioned above, but they do have certain organizational properties that make them easier to analyze. HTML, XML, JSON documents, NoSQL databases, etc., are some examples of semi-structured data.
  • Metadata: It is not a normal form of data, but “data about data”. The primary difference between “data” and “metadata” is that data are simply the material that can classify, measure, or even document something relative to an organization’s data properties, whereas metadata describes the relevant data information, giving it more significance for data users. A basic example of a document’s metadata might be the author, file size, date generated, and keywords that describe the document.

In the area of machine learning and data science, researchers use various widely used datasets for different purposes. These are, for example, cybersecurity datasets such as NSL-KDD [ 119 ], UNSW-NB15 [ 76 ], ISCX’12 [ 1 ], CIC-DDoS2019 [ 2 ], Bot-IoT [ 59 ], etc., smartphone datasets such as phone call logs [ 84 , 101 ], SMS logs [ 29 ], mobile application usage logs [ 117 , 137 ], mobile phone notification logs [ 73 ], etc., IoT data [ 16 , 57 , 62 ], agriculture and e-commerce data [ 120 , 138 ], health data such as heart disease [ 92 ], diabetes mellitus [ 83 , 134 ], COVID-19 [ 43 , 74 ], etc., and many more in various application domains. These data can be of the different types discussed above and may vary from application to application in the real world. To analyze such data in a particular problem domain, and to extract insights or useful knowledge from the data for building real-world intelligent applications, different types of machine learning techniques can be used according to their learning capabilities, as discussed in the following.

Types of Machine Learning Techniques

Machine learning algorithms are mainly divided into four categories: supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning [ 75 ], as shown in Fig. 2. In the following, we briefly discuss each type of learning technique with the scope of its applicability to solve real-world problems.

Fig. 2 Various types of machine learning techniques

  • Supervised: Supervised learning is typically the task of machine learning to learn a function that maps an input to an output based on sample input-output pairs [ 41 ]. It uses labeled training data and a collection of training examples to infer a function. Supervised learning is carried out when certain goals are identified to be accomplished from a certain set of inputs [ 105 ], i.e., a task-driven approach . The most common supervised tasks are “classification” that separates the data, and “regression” that fits the data. For instance, predicting the class label or sentiment of a piece of text, like a tweet or a product review, i.e., text classification, is an example of supervised learning.
  • Unsupervised: Unsupervised learning analyzes unlabeled datasets without the need for human interference, i.e., a data-driven process [ 41 ]. This is widely used for extracting generative features, identifying meaningful trends and structures, groupings in results, and exploratory purposes. The most common unsupervised learning tasks are clustering, density estimation, feature learning, dimensionality reduction, finding association rules, anomaly detection, etc.
  • Semi-supervised: Semi-supervised learning can be defined as a hybridization of the above-mentioned supervised and unsupervised methods, as it operates on both labeled and unlabeled data [ 41 , 105 ]. Thus, it falls between learning “without supervision” and learning “with supervision”. In the real world, labeled data could be rare in several contexts, and unlabeled data are numerous, where semi-supervised learning is useful [ 75 ]. The ultimate goal of a semi-supervised learning model is to provide a better outcome for prediction than that produced using the labeled data alone from the model. Some application areas where semi-supervised learning is used include machine translation, fraud detection, labeling data and text classification.
  • Reinforcement: Reinforcement learning is a type of machine learning algorithm that enables software agents and machines to automatically evaluate the optimal behavior in a particular context or environment to improve their efficiency [ 52 ], i.e., an environment-driven approach . This type of learning is based on reward or penalty, and its ultimate goal is to use insights obtained from interacting with the environment to take actions that increase the reward or minimize the risk [ 75 ]. It is a powerful tool for training AI models that can help increase automation or optimize the operational efficiency of sophisticated systems such as robotics, autonomous driving, manufacturing, and supply chain logistics; however, it is not preferable for solving basic or straightforward problems. A minimal sketch contrasting these paradigms follows this list.
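To make the distinction concrete, the following minimal Python sketch (using scikit-learn; the tiny toy data and all parameter values are illustrative assumptions) fits a supervised, an unsupervised, and a semi-supervised model on the same inputs:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression   # supervised
from sklearn.cluster import KMeans                    # unsupervised
from sklearn.semi_supervised import LabelPropagation  # semi-supervised

X = np.array([[0.0], [0.2], [0.9], [1.1]])
y = np.array([0, 0, 1, 1])

clf = LogisticRegression().fit(X, y)          # learns from labeled pairs (X, y)
km = KMeans(n_clusters=2, n_init=10).fit(X)   # groups X without any labels

y_partial = np.array([0, -1, -1, 1])          # -1 marks unlabeled examples
lp = LabelPropagation().fit(X, y_partial)     # uses both labeled and unlabeled data
print(clf.predict([[0.5]]), km.labels_, lp.transduction_)
```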

Thus, to build effective models in various application areas, different types of machine learning techniques can play a significant role according to their learning capabilities, depending on the nature of the data discussed earlier and the target outcome. In Table 1, we summarize the various types of machine learning techniques with examples. In the following, we provide a comprehensive view of machine learning algorithms that can be applied to enhance the intelligence and capabilities of a data-driven application.

Table 1 Various types of machine learning techniques with examples

  • Supervised: algorithms or models learn from labeled data (task-driven approach); examples: classification, regression.
  • Unsupervised: algorithms or models learn from unlabeled data (data-driven approach); examples: clustering, associations, dimensionality reduction.
  • Semi-supervised: models are built using combined labeled and unlabeled data; examples: classification, clustering.
  • Reinforcement: models are based on reward or penalty (environment-driven approach); examples: classification, control.

Machine Learning Tasks and Algorithms

In this section, we discuss various machine learning algorithms, including classification analysis, regression analysis, data clustering, association rule learning, feature engineering for dimensionality reduction, and deep learning methods. A general structure of a machine learning-based predictive model is shown in Fig. 3, where the model is trained from historical data in phase 1 and the outcome is generated in phase 2 for the new test data.

Fig. 3 A general structure of a machine learning-based predictive model considering both the training and testing phases

Classification Analysis

Classification is regarded as a supervised learning method in machine learning; it refers to a problem of predictive modeling, where a class label is predicted for a given example [ 41 ]. Mathematically, it learns a function ( f ) that maps input variables ( X ) to output variables ( Y ), i.e., targets, labels, or categories. It can be carried out on structured or unstructured data to predict the class of given data points. For example, spam detection, with classes “spam” and “not spam”, in email service providers is a classification problem. In the following, we summarize the common classification problems.

  • Binary classification: It refers to the classification tasks having two class labels such as “true and false” or “yes and no” [ 41 ]. In such binary classification tasks, one class could be the normal state, while the abnormal state could be another class. For instance, “cancer not detected” is the normal state of a task that involves a medical test, and “cancer detected” could be considered as the abnormal state. Similarly, “spam” and “not spam” in the above example of email service providers are considered as binary classification.
  • Multiclass classification: Traditionally, this refers to those classification tasks having more than two class labels [ 41 ]. The multiclass classification does not have the principle of normal and abnormal outcomes, unlike binary classification tasks. Instead, within a range of specified classes, examples are classified as belonging to one. For example, it can be a multiclass classification task to classify various types of network attacks in the NSL-KDD [ 119 ] dataset, where the attack categories are classified into four class labels, such as DoS (Denial of Service Attack), U2R (User to Root Attack), R2L (Root to Local Attack), and Probing Attack.
  • Multi-label classification: In machine learning, multi-label classification is an important consideration where an example is associated with several classes or labels. Thus, it is a generalization of multiclass classification, where the classes involved in the problem are hierarchically structured, and each example may simultaneously belong to more than one class in each hierarchical level, e.g., multi-level text classification. For instance, Google news can be presented under the categories of a “city name”, “technology”, or “latest news”, etc. Multi-label classification includes advanced machine learning algorithms that support predicting various mutually non-exclusive classes or labels, unlike traditional classification tasks where class labels are mutually exclusive [ 82 ].

Many classification algorithms have been proposed in the machine learning and data science literature [ 41 , 125 ]. In the following, we summarize the most common and popular methods that are used widely in various application areas.

  • Naive Bayes (NB): The naive Bayes algorithm is based on Bayes’ theorem with the assumption of independence between each pair of features [ 51 ]. It works well and can be used for both binary and multi-class categories in many real-world situations, such as document or text classification, spam filtering, etc. The NB classifier can be used to effectively classify noisy instances in the data and to construct a robust prediction model [ 94 ]. The key benefit is that, compared to more sophisticated approaches, it needs only a small amount of training data to estimate the necessary parameters quickly [ 82 ]. However, its performance may suffer due to its strong assumption of feature independence. Gaussian, Multinomial, Complement, Bernoulli, and Categorical are the common variants of the NB classifier [ 82 ].
  • Linear Discriminant Analysis (LDA): Linear Discriminant Analysis (LDA) is a linear decision boundary classifier created by fitting class conditional densities to data and applying Bayes’ rule [ 51 , 82 ]. This method is also known as a generalization of Fisher’s linear discriminant, which projects a given dataset into a lower-dimensional space, i.e., a reduction of dimensionality that minimizes the complexity of the model or reduces the resulting model’s computational costs. The standard LDA model usually suits each class with a Gaussian density, assuming that all classes share the same covariance matrix [ 82 ]. LDA is closely related to ANOVA (analysis of variance) and regression analysis, which seek to express one dependent variable as a linear combination of other features or measurements.
  • Logistic regression (LR): Another common probabilistic statistical model used to solve classification problems in machine learning is logistic regression (LR) [ 64 ]. Logistic regression typically uses a logistic function to estimate the probabilities, also referred to as the mathematically defined sigmoid function given in Eq. (1). It may overfit high-dimensional datasets and works well when the dataset can be separated linearly. The regularization (L1 and L2) techniques [ 82 ] can be used to avoid over-fitting in such scenarios. The assumption of linearity between the dependent and independent variables is considered a major drawback of logistic regression. It can be used for both classification and regression problems, but it is more commonly used for classification.

\( g(z) = \frac{1}{1 + \exp(-z)} \)   (1)
  • K-nearest neighbors (KNN): K-Nearest Neighbors (KNN) [ 9 ] is an “instance-based learning” or non-generalizing learning, also known as a “lazy learning” algorithm. It does not focus on constructing a general internal model; instead, it stores all instances corresponding to training data in n -dimensional space. KNN uses data and classifies new data points based on similarity measures (e.g., Euclidean distance function) [ 82 ]. Classification is computed from a simple majority vote of the k nearest neighbors of each point. It is quite robust to noisy training data, and accuracy depends on the data quality. The biggest issue with KNN is to choose the optimal number of neighbors to be considered. KNN can be used both for classification as well as regression.
  • Support vector machine (SVM): In machine learning, another common technique that can be used for classification, regression, or other tasks is a support vector machine (SVM) [ 56 ]. In high- or infinite-dimensional space, a support vector machine constructs a hyper-plane or set of hyper-planes. Intuitively, the hyper-plane, which has the greatest distance from the nearest training data points in any class, achieves a strong separation since, in general, the greater the margin, the lower the classifier’s generalization error. It is effective in high-dimensional spaces and can behave differently based on different mathematical functions known as the kernel. Linear, polynomial, radial basis function (RBF), sigmoid, etc., are the popular kernel functions used in SVM classifier [ 82 ]. However, when the data set contains more noise, such as overlapping target classes, SVM does not perform well.

Fig. 4 An example of a decision tree structure

Fig. 5 An example of a random forest structure considering multiple decision trees

  • Adaptive Boosting (AdaBoost): Adaptive Boosting (AdaBoost) is an ensemble learning process that employs an iterative approach to improve poor classifiers by learning from their errors. It was developed by Freund et al. [ 35 ] and is also known as “meta-learning”. Unlike the random forest, which uses parallel ensembling, AdaBoost uses “sequential ensembling”. It creates a powerful classifier by combining many poorly performing classifiers into one of high accuracy. In that sense, AdaBoost is called an adaptive classifier, as it significantly improves the efficiency of the classifier, although in some instances it can trigger overfitting. AdaBoost is best used to boost the performance of decision trees, its base estimator [ 82 ], on binary classification problems; however, it is sensitive to noisy data and outliers.
  • Extreme gradient boosting (XGBoost): Gradient Boosting, like Random Forests [ 19 ] above, is an ensemble learning algorithm that generates a final model based on a series of individual models, typically decision trees. The gradient is used to minimize the loss function, similar to how neural networks [ 41 ] use gradient descent to optimize weights. Extreme Gradient Boosting (XGBoost) is a form of gradient boosting that takes more detailed approximations into account when determining the best model [ 82 ]. It computes second-order gradients of the loss function to minimize loss and advanced regularization (L1 and L2) [ 82 ], which reduces over-fitting, and improves model generalization and performance. XGBoost is fast to interpret and can handle large-sized datasets well.
  • Stochastic gradient descent (SGD): Stochastic gradient descent (SGD) [ 41 ] is an iterative method for optimizing an objective function with appropriate smoothness properties, where the word ‘stochastic’ refers to random probability. This reduces the computational burden, particularly in high-dimensional optimization problems, allowing for faster iterations in exchange for a lower convergence rate. A gradient is the slope of a function that measures a variable’s degree of change in response to changes in another variable. Mathematically, gradient descent operates on a convex function whose output is a partial derivative with respect to its input parameters. Let \( \alpha \) be the learning rate and \( J_i \) the cost of the \( i \)-th training example; then Eq. (4) gives the stochastic gradient descent weight update at the \( j \)-th iteration:

\( w_j := w_j - \alpha \frac{\partial J_i}{\partial w_j} \)   (4)

In large-scale and sparse machine learning, SGD has been successfully applied to problems often encountered in text classification and natural language processing [ 82 ]. However, SGD is sensitive to feature scaling and needs a range of hyperparameters, such as the regularization parameter and the number of iterations.
  • Rule-based classification : The term rule-based classification can be used to refer to any classification scheme that makes use of IF-THEN rules for class prediction. Several classification algorithms, such as Zero-R [ 125 ], One-R [ 47 ], decision trees [ 87 , 88 ], DTNB [ 110 ], Ripple Down Rule learner (RIDOR) [ 125 ], and Repeated Incremental Pruning to Produce Error Reduction (RIPPER) [ 126 ], exist with the ability of rule generation. The decision tree is one of the most common rule-based classification algorithms among these techniques because it has several advantages, such as being easier to interpret; the ability to handle high-dimensional data; simplicity and speed; good accuracy; and the capability to produce rules that are clear and understandable to humans [ 127 , 128 ]. The decision tree-based rules also provide significant accuracy in a prediction model for unseen test cases [ 106 ]. Since the rules are easily interpretable, these rule-based classifiers are often used to produce descriptive models that can describe a system, including its entities and their relationships. A minimal sketch comparing several of the classifiers above on a common dataset is given below.
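The following minimal scikit-learn sketch trains a few of the classifiers discussed above on a common benchmark dataset and reports held-out accuracy; the dataset choice, split, and all hyperparameter values are illustrative assumptions rather than recommendations from the surveyed work.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# A binary classification task: malignant vs. benign tumors.
X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=42)

models = {
    "Naive Bayes": GaussianNB(),
    "Logistic Regression": LogisticRegression(max_iter=5000),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "SVM": SVC(kernel="rbf"),
    "Decision Tree": DecisionTreeClassifier(max_depth=5),
}
for name, model in models.items():
    acc = model.fit(X_tr, y_tr).score(X_te, y_te)
    print(f"{name}: {acc:.3f}")
```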

Regression Analysis

Regression analysis includes several methods of machine learning that allow one to predict a continuous ( y ) result variable based on the value of one or more ( x ) predictor variables [ 41 ]. The most significant distinction between classification and regression is that classification predicts distinct class labels, while regression facilitates the prediction of a continuous quantity. Fig. 6 shows how classification differs from regression models. Some overlaps are often found between the two types of machine learning algorithms. Regression models are now widely used in a variety of fields, including financial forecasting or prediction, cost estimation, trend analysis, marketing, time series estimation, drug response modeling, and many more. Some of the familiar types of regression algorithms are linear, polynomial, lasso, and ridge regression, which are explained briefly in the following.

  • Simple and multiple linear regression: This is one of the most popular ML modeling techniques as well as a well-known regression technique. In this technique, the dependent variable is continuous, the independent variable(s) can be continuous or discrete, and the form of the regression line is linear. Linear regression creates a relationship between the dependent variable ( Y ) and one or more independent variables ( X ) (the regression line) using the best-fit straight line [ 41 ]. It is defined by the following equations:

\( y = a + bx + e \)   (5)

\( y = a + b_1 x_1 + b_2 x_2 + \dots + b_n x_n + e \)   (6)

where a is the intercept, b is the slope of the line, and e is the error term. These equations can be used to predict the value of the target variable based on the given predictor variable(s). Multiple linear regression, defined in Eq. (6), is an extension of simple linear regression that allows two or more predictor variables to model a response variable y as a linear function [ 41 ], whereas simple linear regression, defined in Eq. (5), has only one independent variable.
  • Polynomial regression: Polynomial regression is a form of regression analysis in which the relationship between the independent variable x and the dependent variable y is not linear, but is modeled as an n-th degree polynomial in x [ 82 ]. The equation for polynomial regression is derived from the linear regression (polynomial regression of degree 1) equation and is defined as:

\( y = b_0 + b_1 x + b_2 x^2 + b_3 x^3 + \dots + b_n x^n + e \)   (7)

Here, y is the predicted/target output, \( b_0, b_1, \dots, b_n \) are the regression coefficients, and x is the independent input variable. In simple words, if the data are not distributed linearly but instead follow an n-th degree polynomial, then polynomial regression is used to obtain the desired output.
  • LASSO and ridge regression: LASSO and ridge regression are well known as powerful techniques that are typically used for building learning models in the presence of a large number of features, due to their capability to prevent over-fitting and reduce the complexity of the model. The LASSO (least absolute shrinkage and selection operator) regression model uses the L1 regularization technique [ 82 ], which applies shrinkage by penalizing the “absolute value of the magnitude of coefficients” (L1 penalty). As a result, LASSO tends to shrink coefficients to exactly zero. Thus, LASSO regression aims to find the subset of predictors that minimizes the prediction error for a quantitative response variable. On the other hand, ridge regression uses L2 regularization [ 82 ], which penalizes the “squared magnitude of coefficients” (L2 penalty). Thus, ridge regression forces the weights to be small but never sets a coefficient value to zero, giving a non-sparse solution. Overall, LASSO regression is useful for obtaining a subset of predictors by eliminating less important features, while ridge regression is useful when a dataset has “multicollinearity”, i.e., predictors that are correlated with other predictors. A brief sketch of these regression models follows this list.
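As a rough illustration of the regression models above, the following minimal scikit-learn sketch fits linear, polynomial (Eq. (7) with degree 2), LASSO, and ridge models to synthetic data; the data-generating process and all alpha values are illustrative assumptions:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Lasso, Ridge
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=(100, 1))
y = 0.5 * x[:, 0] ** 2 + x[:, 0] + rng.normal(0, 0.3, size=100)  # quadratic signal

linear = LinearRegression().fit(x, y)               # Eq. (5): y = a + bx + e
poly = make_pipeline(PolynomialFeatures(degree=2),  # Eq. (7): adds the x^2 term
                     LinearRegression()).fit(x, y)
lasso = Lasso(alpha=0.1).fit(x, y)                  # L1 penalty: sparse coefficients
ridge = Ridge(alpha=1.0).fit(x, y)                  # L2 penalty: small, non-zero coefficients

# The polynomial model should fit the curved signal far better than the linear ones.
for name, model in [("linear", linear), ("poly", poly), ("lasso", lasso), ("ridge", ridge)]:
    print(name, round(model.score(x, y), 3))
```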

Fig. 6 Classification vs. regression. In classification, the dotted line represents a linear boundary that separates the two classes; in regression, the dotted line models the linear relationship between the two variables

Cluster Analysis

Cluster analysis, also known as clustering, is an unsupervised machine learning technique for identifying and grouping related data points in large datasets without concern for a specific outcome. It groups a collection of objects in such a way that objects in the same category, called a cluster, are in some sense more similar to each other than objects in other groups [ 41 ]. It is often used as a data analysis technique to discover interesting trends or patterns in data, e.g., groups of consumers based on their behavior. Clustering can be used in a broad range of application areas, such as cybersecurity, e-commerce, mobile data processing, health analytics, user modeling, and behavioral analytics. In the following, we briefly discuss and summarize various types of clustering methods.

  • Partitioning methods: Based on the features and similarities in the data, this clustering approach categorizes the data into multiple groups or clusters. Data scientists or analysts typically determine the number of clusters either dynamically or statically, depending on the nature of the target application. The most common clustering algorithms based on partitioning methods are K-means [ 69 ], K-medoids [ 80 ], CLARA [ 55 ], etc.
  • Density-based methods: To identify distinct groups or clusters, it uses the concept that a cluster in the data space is a contiguous region of high point density isolated from other such clusters by contiguous regions of low point density. Points that are not part of a cluster are considered as noise. The typical clustering algorithms based on density are DBSCAN [ 32 ], OPTICS [ 12 ] etc. The density-based methods typically struggle with clusters of similar density and high dimensionality data.

Fig. 7 A graphical interpretation of the widely used hierarchical clustering (bottom-up and top-down) technique

  • Grid-based methods: To deal with massive datasets, grid-based clustering is especially suitable. To obtain clusters, the principle is first to summarize the dataset with a grid representation and then to combine grid cells. STING [ 122 ], CLIQUE [ 6 ], etc. are the standard algorithms of grid-based clustering.
  • Model-based methods: There are mainly two types of model-based clustering algorithms: one that uses statistical learning, and the other based on a method of neural network learning [ 130 ]. For instance, GMM [ 89 ] is an example of a statistical learning method, and SOM [ 22 ] [ 96 ] is an example of a neural network learning method.
  • Constraint-based methods: Constrained-based clustering is a semi-supervised approach to data clustering that uses constraints to incorporate domain knowledge. Application or user-oriented constraints are incorporated to perform the clustering. The typical algorithms of this kind of clustering are COP K-means [ 121 ], CMWK-Means [ 27 ], etc.

Many clustering algorithms with the ability to group data have been proposed in the machine learning and data science literature [ 41 , 125 ]. In the following, we summarize the popular methods that are used widely in various application areas.

  • K-means clustering: K-means clustering [ 69 ] is a fast, robust, and simple algorithm that provides reliable results when data sets are well separated from each other. In this algorithm, the data points are allocated to clusters in such a way that the sum of the squared distances between the data points and the centroids is as small as possible. In other words, the K-means algorithm identifies k centroids and then assigns each data point to the nearest cluster while keeping the centroids as small as possible. Since it begins with a random selection of cluster centers, the results can be inconsistent. Since extreme values can easily affect a mean, the K-means clustering algorithm is sensitive to outliers. K-medoids clustering [ 91 ] is a variant of K-means that is more robust to noise and outliers.
  • Mean-shift clustering: Mean-shift clustering [ 37 ] is a nonparametric clustering technique that does not require prior knowledge of the number of clusters or constraints on cluster shape. Mean-shift clustering aims to discover “blobs” in a smooth distribution or density of samples [ 82 ]. It is a centroid-based algorithm that works by updating centroid candidates to be the mean of the points in a given region. To form the final set of centroids, these candidates are filtered in a post-processing stage to remove near-duplicates. Cluster analysis in computer vision and image processing are examples of application domains. Mean Shift has the disadvantage of being computationally expensive. Moreover, in cases of high dimension, where the number of clusters shifts abruptly, the mean-shift algorithm does not work well.
  • DBSCAN: Density-based spatial clustering of applications with noise (DBSCAN) [ 32 ] is a base algorithm for density-based clustering which is widely used in data mining and machine learning. It is a non-parametric density-based clustering technique for separating high-density clusters from low-density clusters. DBSCAN’s main idea is that a point belongs to a cluster if it is close to many points from that cluster. It can find clusters of various shapes and sizes in a vast volume of data that is noisy and contains outliers. Unlike k-means, DBSCAN does not require a priori specification of the number of clusters in the data and can find arbitrarily shaped clusters. Although k-means is much faster than DBSCAN, DBSCAN is efficient at finding high-density regions and outliers, i.e., it is robust to outliers.
  • GMM clustering: Gaussian mixture models (GMMs) are often used for data clustering, which is a distribution-based clustering algorithm. A Gaussian mixture model is a probabilistic model in which all the data points are produced by a mixture of a finite number of Gaussian distributions with unknown parameters [ 82 ]. To find the Gaussian parameters for each cluster, an optimization algorithm called expectation-maximization (EM) [ 82 ] can be used. EM is an iterative method that uses a statistical model to estimate the parameters. In contrast to k-means, Gaussian mixture models account for uncertainty and return the likelihood that a data point belongs to one of the k clusters. GMM clustering is more robust than k-means and works well even with non-linear data distributions.
  • Agglomerative hierarchical clustering: The most common method of hierarchical clustering used to group objects in clusters based on their similarity is agglomerative clustering. This technique uses a bottom-up approach, where each object is first treated as a singleton cluster by the algorithm. Following that, pairs of clusters are merged one by one until all clusters have been merged into a single large cluster containing all objects. The result is a dendrogram, which is a tree-based representation of the elements. Single linkage [ 115 ], complete linkage [ 116 ], BOTS [ 102 ], etc. are some examples of such techniques. The main advantage of agglomerative hierarchical clustering over k-means is that the tree-structured hierarchy it generates is more informative than the unstructured collection of flat clusters returned by k-means, which can help to make better decisions in the relevant application areas. A minimal clustering sketch covering several of these methods follows this list.
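The following minimal scikit-learn sketch runs a partitioning, a density-based, a distribution-based, and a hierarchical method on the same synthetic data; the blob parameters, eps value, and cluster counts are illustrative assumptions:

```python
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans, DBSCAN, AgglomerativeClustering
from sklearn.mixture import GaussianMixture

# Three well-separated blobs of points.
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.6, random_state=0)

km = KMeans(n_clusters=3, n_init=10).fit(X)        # partitioning method
db = DBSCAN(eps=0.5, min_samples=5).fit(X)         # density-based; label -1 marks noise
gm = GaussianMixture(n_components=3).fit(X)        # distribution-based (GMM)
ag = AgglomerativeClustering(n_clusters=3).fit(X)  # bottom-up hierarchical

print(km.labels_[:10], db.labels_[:10], gm.predict(X)[:10], ag.labels_[:10])
```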

Dimensionality Reduction and Feature Learning

In machine learning and data science, high-dimensional data processing is a challenging task for both researchers and application developers. Thus, dimensionality reduction, which is an unsupervised learning technique, is important because it leads to better human interpretation, lower computational cost, and the avoidance of overfitting and redundancy by simplifying models. Both feature selection and feature extraction can be used for dimensionality reduction. The primary distinction between them is that “feature selection” keeps a subset of the original features [ 97 ], while “feature extraction” creates brand new ones [ 98 ]. In the following, we briefly discuss these techniques.

  • Feature selection: The selection of features, also known as the selection of variables or attributes in the data, is the process of choosing a subset of unique features (variables, predictors) to use in building a machine learning and data science model. It decreases a model’s complexity by eliminating irrelevant or less important features and allows for faster training of machine learning algorithms. A right and optimal subset of the selected features in a problem domain can minimize the overfitting problem by simplifying and generalizing the model, as well as increase the model’s accuracy [ 97 ]. Thus, “feature selection” [ 66 , 99 ] is considered one of the primary concepts in machine learning, greatly affecting the effectiveness and efficiency of the target machine learning model. The chi-squared test, analysis of variance (ANOVA) test, Pearson’s correlation coefficient, and recursive feature elimination are some popular techniques that can be used for feature selection.
  • Feature extraction: In a machine learning-based model or system, feature extraction techniques usually provide a better understanding of the data, a way to improve prediction accuracy, and to reduce computational cost or training time. The aim of “feature extraction” [ 66 , 99 ] is to reduce the number of features in a dataset by generating new ones from the existing ones and then discarding the original features. The majority of the information found in the original set of features can then be summarized using this new reduced set of features. For instance, principal components analysis (PCA) is often used as a dimensionality-reduction technique to extract a lower-dimensional space creating new brand components from the existing features in a dataset [ 98 ].

Many algorithms have been proposed to reduce data dimensions in the machine learning and data science literature [ 41 , 125 ]. In the following, we summarize the popular methods that are used widely in various application areas.

  • Variance threshold: A simple basic approach to feature selection is the variance threshold [ 82 ]. This excludes all features of low variance, i.e., all features whose variance does not exceed the threshold. By default, it eliminates all zero-variance features, i.e., features that have the same value in all samples. This feature selection algorithm looks only at the ( X ) features, not the desired ( y ) outputs, and can, therefore, be used for unsupervised learning.
  • Pearson correlation: Pearson’s correlation is another method to understand a feature’s relation to the response variable and can be used for feature selection [ 99 ]. This method is also used for finding the association between features in a dataset. The resulting value lies in \([-1, 1]\), where \(-1\) means perfect negative correlation, \(+1\) means perfect positive correlation, and 0 means that the two variables do not have a linear correlation. For two random variables X and Y, the correlation coefficient between X and Y is defined as [ 41 ]:

\( r(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2} \sqrt{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}} \)   (8)
  • ANOVA: Analysis of variance (ANOVA) is a statistical tool used to test whether the mean values of two or more groups differ significantly from each other. ANOVA assumes a linear relationship between the variables and the target, as well as normally distributed variables. To statistically test the equality of means, the ANOVA method utilizes F-tests. For feature selection, the resulting ‘ANOVA F-value’ [ 82 ] of this test can be used to omit features that are independent of the target variable.
  • Chi square: The chi-square \( \chi^2 \) [ 82 ] statistic is an estimate of the difference between the observed and expected frequencies of a series of events or variables. The value of \( \chi^2 \) depends on the magnitude of the difference between the observed and expected values, the degrees of freedom, and the sample size. The chi-square \( \chi^2 \) test is commonly used for testing relationships between categorical variables. If \( O_i \) represents the observed value and \( E_i \) the expected value, then:

\( \chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i} \)   (9)
  • Recursive feature elimination (RFE): Recursive feature elimination (RFE) is a brute-force approach to feature selection. RFE [ 82 ] fits the model and removes the weakest feature repeatedly until the specified number of features is reached. Features are ranked by the model’s coefficients or feature importances. By recursively removing a small number of features per iteration, RFE aims to eliminate dependencies and collinearity in the model.
  • Model-based selection: To reduce the dimensionality of the data, linear models penalized with L1 regularization can be used. Least absolute shrinkage and selection operator (LASSO) regression is a type of linear regression that has the property of shrinking some of the coefficients to zero [ 82 ], so the corresponding features can be removed from the model. Thus, the penalized LASSO regression method is often used in machine learning to select a subset of variables. The Extra Trees classifier [ 82 ] is an example of a tree-based estimator that can be used to compute impurity-based feature importances, which can then be used to discard irrelevant features. A short feature selection and extraction sketch follows this list.
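As a rough illustration, the following minimal scikit-learn sketch applies several of the selection methods above, plus PCA as a feature extraction step; the dataset, thresholds, and k values are illustrative assumptions:

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.feature_selection import RFE, SelectKBest, VarianceThreshold, chi2
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# Variance threshold: drop near-constant features (looks only at X, not y).
X_vt = VarianceThreshold(threshold=0.2).fit_transform(X)

# Chi-square: keep the k features most associated with the target (Eq. (9)).
X_chi = SelectKBest(chi2, k=2).fit_transform(X, y)

# RFE: recursively drop the weakest feature according to model coefficients.
rfe = RFE(LogisticRegression(max_iter=1000), n_features_to_select=2).fit(X, y)

# PCA: feature extraction, i.e., new components replacing the original features.
X_pca = PCA(n_components=2).fit_transform(X)

print(X_vt.shape, X_chi.shape, rfe.support_, X_pca.shape)
```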

Fig. 8 An example of principal component analysis (PCA) and the created principal components PC1 and PC2 in a different dimension space

Association Rule Learning

Association rule learning is a rule-based machine learning approach to discover interesting relationships between variables in large datasets, expressed as “IF-THEN” statements [ 7 ]. One example is that “if a customer buys a computer or laptop (an item), s/he is likely to also buy anti-virus software (another item) at the same time”. Association rules are employed today in many application areas, including IoT services, medical diagnosis, usage behavior analytics, web usage mining, smartphone applications, cybersecurity applications, and bioinformatics. In comparison to sequence mining, association rule learning does not usually take into account the order of items within or across transactions. A common way of measuring the usefulness of association rules is to use their parameters, ‘support’ and ‘confidence’, which were introduced in [ 7 ].

In the data mining literature, many association rule learning methods have been proposed, such as logic dependent [ 34 ], frequent pattern based [ 8 , 49 , 68 ], and tree-based [ 42 ]. The most popular association rule learning algorithms are summarized below.

  • AIS and SETM: AIS is the first algorithm proposed by Agrawal et al. [ 7 ] for association rule mining. The AIS algorithm’s main downside is that too many candidate itemsets are generated, requiring more space and wasting a lot of effort. This algorithm calls for too many passes over the entire dataset to produce the rules. Another approach SETM [ 49 ] exhibits good performance and stable behavior with execution time; however, it suffers from the same flaw as the AIS algorithm.
  • Apriori: For generating association rules for a given dataset, Agrawal et al. [ 8 ] proposed the Apriori, Apriori-TID, and Apriori-Hybrid algorithms. These later algorithms outperform the AIS and SETM mentioned above due to the Apriori property of frequent itemsets [ 8 ]. The term ‘Apriori’ usually refers to having prior knowledge of frequent itemset properties. Apriori uses a “bottom-up” approach to generate the candidate itemsets. To reduce the search space, Apriori uses the property “all subsets of a frequent itemset must be frequent; and if an itemset is infrequent, then all its supersets must also be infrequent”. Another approach, predictive Apriori [ 108 ], can also generate rules; however, it may produce unexpected results as it combines both support and confidence. The Apriori algorithm [ 8 ] is the most widely applicable technique in mining association rules.
  • ECLAT: This technique was proposed by Zaki et al. [ 131 ] and stands for Equivalence Class Clustering and bottom-up Lattice Traversal. ECLAT uses a depth-first search to find frequent itemsets. In contrast to the Apriori [ 8 ] algorithm, which represents data in a horizontal pattern, it represents data vertically. Hence, the ECLAT algorithm is more efficient and scalable in the area of association rule learning. This algorithm is better suited for small and medium datasets whereas the Apriori algorithm is used for large datasets.
  • FP-Growth: Another common association rule learning technique based on the frequent-pattern tree (FP-tree) proposed by Han et al. [ 42 ] is Frequent Pattern Growth, known as FP-Growth. The key difference with Apriori is that while generating rules, the Apriori algorithm [ 8 ] generates frequent candidate itemsets; on the other hand, the FP-growth algorithm [ 42 ] prevents candidate generation and thus produces a tree by the successful strategy of ‘divide and conquer’ approach. Due to its sophistication, however, FP-Tree is challenging to use in an interactive mining environment [ 133 ]. Thus, the FP-Tree would not fit into memory for massive data sets, making it challenging to process big data as well. Another solution is RARM (Rapid Association Rule Mining) proposed by Das et al. [ 26 ] but faces a related FP-tree issue [ 133 ].
  • ABC-RuleMiner: ABC-RuleMiner is a rule-based machine learning method, recently proposed in our earlier paper, Sarker et al. [ 104 ], to discover interesting non-redundant rules to provide real-world intelligent services. This algorithm effectively identifies the redundancy in associations by taking into account the impact or precedence of the related contextual features and discovers a set of non-redundant association rules. It first constructs an association generation tree (AGT) in a top-down fashion and then extracts the association rules by traversing the tree. Thus, ABC-RuleMiner is more potent than traditional rule-based methods in terms of both non-redundant rule generation and intelligent decision-making, particularly in a context-aware smart computing environment where human or user preferences are involved.

Among the association rule learning techniques discussed above, Apriori [ 8 ] is the most widely used algorithm for discovering association rules from a given dataset [ 133 ]. The main strength of the association learning technique is its comprehensiveness, as it generates all associations that satisfy the user-specified constraints, such as minimum support and confidence value. The ABC-RuleMiner approach [ 104 ] discussed earlier could give significant results in terms of non-redundant rule generation and intelligent decision-making for the relevant application areas in the real world.
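For a concrete sense of support and confidence, the following minimal Python sketch mines rules with the Apriori implementation in the third-party mlxtend library (assuming mlxtend is installed); the toy baskets and both thresholds are illustrative assumptions:

```python
import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules

# One-hot encoded transactions: each row is a basket, each column an item.
baskets = pd.DataFrame(
    [[True, True, False], [True, True, True], [False, True, True], [True, True, True]],
    columns=["laptop", "antivirus", "mouse"],
)

# Frequent itemsets above a minimum support, then rules above a minimum confidence.
itemsets = apriori(baskets, min_support=0.5, use_colnames=True)
rules = association_rules(itemsets, metric="confidence", min_threshold=0.7)
print(rules[["antecedents", "consequents", "support", "confidence"]])
```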

Reinforcement Learning

Reinforcement learning (RL) is a machine learning technique that allows an agent to learn by trial and error in an interactive environment, using feedback from its actions and experiences. Unlike supervised learning, which is based on given sample data or examples, the RL method is based on interacting with the environment. The problem to be solved in reinforcement learning (RL) is defined as a Markov decision process (MDP) [ 86 ], i.e., it is all about making decisions sequentially. An RL problem typically includes four elements: agent, environment, rewards, and policy.

RL can be split roughly into Model-based and Model-free techniques. Model-based RL is the process of inferring optimal behavior from a model of the environment by performing actions and observing the results, which include the next state and the immediate reward [ 85 ]. AlphaZero, AlphaGo [ 113 ] are examples of the model-based approaches. On the other hand, a model-free approach does not use the distribution of the transition probability and the reward function associated with MDP. Q-learning, Deep Q Network, Monte Carlo Control, SARSA (State–Action–Reward–State–Action), etc. are some examples of model-free algorithms [ 52 ]. The policy network, which is required for model-based RL but not for model-free, is the key difference between model-free and model-based learning. In the following, we discuss the popular RL algorithms.

  • Monte Carlo methods: Monte Carlo techniques, or Monte Carlo experiments, are a wide category of computational algorithms that rely on repeated random sampling to obtain numerical results [ 52 ]. The underlying concept is to use randomness to solve problems that are deterministic in principle. Optimization, numerical integration, and drawing samples from probability distributions are the three problem classes where Monte Carlo techniques are most commonly used.
  • Q-learning: Q-learning is a model-free reinforcement learning algorithm for learning the quality of behaviors that tell an agent what action to take under what conditions [ 52 ]. It does not need a model of the environment (hence the term “model-free”), and it can deal with stochastic transitions and rewards without the need for adaptations. The ‘Q’ in Q-learning usually stands for quality, as the algorithm calculates the maximum expected rewards for a given behavior in a given state.
  • Deep Q-learning: The basic working step in deep Q-learning [ 52 ] is that the initial state is fed into a neural network, which returns the Q-value of all possible actions as an output. Q-learning works well when we have a reasonably simple setting to handle; however, when the space of states and actions becomes more complex, deep learning can be used as a function approximator. A minimal tabular Q-learning sketch follows this list.
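To make the Q-learning update concrete, here is a minimal tabular sketch on a hand-coded five-state corridor environment; the environment, reward, and all hyperparameter values (alpha, gamma, epsilon) are illustrative assumptions:

```python
import numpy as np

# A 5-state corridor: start in state 0, reward 1 for reaching state 4.
n_states, n_actions = 5, 2        # actions: 0 = left, 1 = right
alpha, gamma, epsilon = 0.1, 0.9, 0.1
Q = np.zeros((n_states, n_actions))
rng = np.random.default_rng(0)

for _ in range(500):                                   # episodes
    s = 0
    while s != 4:
        # Epsilon-greedy: mostly exploit Q, occasionally explore at random.
        a = rng.integers(n_actions) if rng.random() < epsilon else int(Q[s].argmax())
        s_next = min(s + 1, 4) if a == 1 else max(s - 1, 0)
        r = 1.0 if s_next == 4 else 0.0
        # Q-learning update: move Q(s, a) toward r + gamma * max_a' Q(s', a').
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next

print(Q.round(2))  # the learned policy prefers action 1 (right) in every state
```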

Reinforcement learning, along with supervised and unsupervised learning, is one of the basic machine learning paradigms. RL can be used to solve numerous real-world problems in various fields, such as game theory, control theory, operations analysis, information theory, simulation-based optimization, manufacturing, supply chain logistics, multi-agent systems, swarm intelligence, aircraft control, robot motion control, and many more.

Artificial Neural Network and Deep Learning

Deep learning is part of a wider family of artificial neural network (ANN)-based machine learning approaches with representation learning. Deep learning provides a computational architecture that combines several processing layers, such as input, hidden, and output layers, to learn from data [ 41 ]. The main advantage of deep learning over traditional machine learning methods is its better performance in several cases, particularly when learning from large datasets [ 105 , 129 ]. Fig. 9 shows the general performance of deep learning relative to machine learning as the amount of data increases. However, it may vary depending on the data characteristics and experimental setup.

Fig. 9 Machine learning and deep learning performance in general with the amount of data

The most common deep learning algorithms are the multi-layer perceptron (MLP), the convolutional neural network (CNN, or ConvNet), and the long short-term memory recurrent neural network (LSTM-RNN) [ 96 ]. In the following, we discuss these types of deep learning methods, which can be used to build effective data-driven models for various purposes.

Fig. 10: A structure of an artificial neural network modeling with multiple processing layers

Fig. 11: An example of a convolutional neural network (CNN or ConvNet) including multiple convolution and pooling layers

  • MLP: The multi-layer perceptron (MLP) is a feed-forward ANN consisting of an input layer, one or more hidden layers, and an output layer, typically trained with the backpropagation algorithm; its multi-layer structure is sketched in Fig. 10.
  • CNN: The convolutional neural network (CNN, or ConvNet) stacks convolution and pooling layers, as illustrated in Fig. 11, to automatically learn spatial hierarchies of features, and is widely used in image and video recognition tasks.
  • LSTM-RNN: Long short-term memory (LSTM) is an artificial recurrent neural network (RNN) architecture used in the area of deep learning [ 38 ]. Unlike normal feed-forward neural networks, LSTM has feedback connections. LSTM networks are well suited for analyzing and learning from sequential data, such as classifying, processing, and predicting on the basis of time-series data, which differentiates them from other conventional networks. Thus, LSTM can be used whenever the data come in a sequential format, such as time series or sentences, and it is commonly applied in time-series analysis, natural language processing, speech recognition, etc. (see the sketch below).
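As a concrete illustration of the LSTM idea, here is a minimal PyTorch sketch of an LSTM-based sequence classifier (the dimensions, the two-class task, and the single-layer architecture are illustrative assumptions):

import torch
import torch.nn as nn

class LSTMClassifier(nn.Module):
    # An LSTM over a sequence of feature vectors, followed by a linear head.
    def __init__(self, n_features=8, hidden=32, n_classes=2):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):              # x: (batch, time, n_features)
        _, (h_n, _) = self.lstm(x)     # h_n: final hidden state, (1, batch, hidden)
        return self.head(h_n[-1])      # one score vector per sequence

model = LSTMClassifier()
x = torch.randn(4, 20, 8)              # 4 sequences, 20 time steps, 8 features
print(model(x).shape)                  # torch.Size([4, 2])

The recurrent layer consumes the sequence step by step, and its final hidden state summarizes the whole sequence for classification, which is what suits LSTMs to time-series and text data.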

In addition to the most common deep learning methods discussed above, several other deep learning approaches [ 96 ] exist for various purposes. For instance, the self-organizing map (SOM) [ 58 ] uses unsupervised learning to represent high-dimensional data by a 2D grid map, thus achieving dimensionality reduction. The autoencoder (AE) [ 15 ] is another learning technique widely used for dimensionality reduction as well as feature extraction in unsupervised learning tasks; a sketch is given below. Restricted Boltzmann machines (RBM) [ 46 ] can be used for dimensionality reduction, classification, regression, collaborative filtering, feature learning, and topic modeling. A deep belief network (DBN) is typically composed of simple, unsupervised networks such as restricted Boltzmann machines (RBMs) or autoencoders, and a backpropagation neural network (BPNN) [ 123 ]. A generative adversarial network (GAN) [ 39 ] is a deep learning network that can generate data with characteristics close to those of the actual input data. Transfer learning, which typically re-uses a model pre-trained on one problem for a new, related problem, is currently very common because it allows deep neural networks to be trained with comparatively little data [ 124 ]. A brief discussion of these artificial neural network (ANN) and deep learning (DL) models is given in our earlier paper, Sarker et al. [ 96 ].
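As an illustration of the autoencoder idea mentioned above, here is a minimal PyTorch sketch (the layer sizes and the 2-dimensional bottleneck are illustrative choices, not from the survey):

import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    # Encoder compresses the input to a low-dimensional code;
    # decoder reconstructs the input from that code.
    def __init__(self, n_in=20, n_code=2):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_in, 8), nn.ReLU(),
                                     nn.Linear(8, n_code))
        self.decoder = nn.Sequential(nn.Linear(n_code, 8), nn.ReLU(),
                                     nn.Linear(8, n_in))

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = Autoencoder()
x = torch.randn(64, 20)
# Training would minimize this reconstruction error; the learned
# 2-dimensional code then serves as a reduced representation of x.
loss = nn.functional.mse_loss(model(x), x)

Minimizing the reconstruction error forces the bottleneck code to retain the most informative structure of the input, which is why autoencoders serve for dimensionality reduction and feature extraction.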

Overall, based on the learning techniques discussed above, we can conclude that various types of machine learning techniques, such as classification analysis, regression, data clustering, feature selection and extraction, dimensionality reduction, association rule learning, reinforcement learning, and deep learning, can play a significant role for various purposes according to their capabilities. In the following section, we discuss several application areas based on machine learning algorithms.

Applications of Machine Learning

In the current age of the Fourth Industrial Revolution (4IR), machine learning has become popular in various application areas because of its ability to learn from past data and make intelligent decisions. In the following, we summarize and discuss ten popular application areas of machine learning technology.

  • Predictive analytics and intelligent decision-making: A major application field of machine learning is intelligent decision-making through data-driven predictive analytics [ 21 , 70 ]. The basis of predictive analytics is capturing and exploiting relationships between explanatory variables and predicted variables from previous events to predict an unknown outcome [ 41 ]. Examples include identifying suspects or criminals after a crime has been committed, and detecting credit card fraud as it happens. In another application, machine learning algorithms can assist retailers in better understanding consumer preferences and behavior, managing inventory, avoiding out-of-stock situations, and optimizing logistics and warehousing in e-commerce. Various machine learning algorithms, such as decision trees, support vector machines, and artificial neural networks [ 106 , 125 ], are commonly used in the area. Since accurate predictions provide insight into the unknown, they can improve the decisions of industries, businesses, and almost any organization, including government agencies, e-commerce, telecommunications, banking and financial services, healthcare, sales and marketing, transportation, social networking, and many others.
  • Cybersecurity and threat intelligence: Cybersecurity, one of the most essential areas of Industry 4.0 [ 114 ], is the practice of protecting networks, systems, hardware, and data from digital attacks. Machine learning has become a crucial cybersecurity technology that constantly learns by analyzing data to identify patterns, better detect malware in encrypted traffic, find insider threats, predict where “bad neighborhoods” are online, keep people safe while browsing, and secure data in the cloud by uncovering suspicious activity. For instance, clustering techniques can be used to identify cyber-anomalies, policy violations, etc. Machine learning classification models that take into account the impact of security features are useful for detecting various types of cyber-attacks and intrusions [ 97 ]. Various deep learning-based security models can also be used on large-scale security datasets [ 96 , 129 ]. Moreover, security policy rules generated by association rule learning techniques can play a significant role in building rule-based security systems [ 105 ]. Thus, the various learning techniques discussed in Sect. “ Machine Learning Tasks and Algorithms ” can enable cybersecurity professionals to be more proactive in efficiently preventing threats and cyber-attacks.
  • Internet of things (IoT) and smart cities: The Internet of Things (IoT) is another essential area of Industry 4.0 [ 114 ]; it turns everyday objects into smart objects by allowing them to transmit data and automate tasks without the need for human interaction. IoT is, therefore, considered the big frontier that can enhance almost all activities in our lives, such as smart governance, smart home, education, communication, transportation, retail, agriculture, health care, business, and many more [ 70 ]. The smart city is one of IoT’s core fields of application, using technologies to enhance city services and residents’ living experiences [ 132 , 135 ]. Because machine learning utilizes experience to recognize trends and create models that help predict future behavior and events, it has become a crucial technology for IoT applications [ 103 ]. For example, predicting traffic in smart cities, predicting parking availability, estimating citizens’ total energy usage over a particular period, and making context-aware and timely decisions for people are tasks that can be solved using machine learning techniques according to the current needs of the people.
  • Traffic prediction and transportation: Transportation systems have become a crucial component of every country’s economic development. Nonetheless, several cities around the world are experiencing an excessive rise in traffic volume, resulting in serious issues such as delays, traffic congestion, higher fuel prices, increased CO2 pollution, accidents, emergencies, and a decline in modern society’s quality of life [ 40 ]. Thus, an intelligent transportation system that predicts future traffic is an indispensable part of a smart city. Accurate traffic prediction based on machine and deep learning modeling can help to minimize these issues [ 17 , 30 , 31 ]. For example, based on travel history and the trends of traveling through various routes, machine learning can assist transportation companies in predicting possible issues that may occur on specific routes and in recommending that their customers take a different path. Ultimately, these learning-based data-driven models help improve traffic flow, increase the usage and efficiency of sustainable modes of transportation, and limit real-world disruption by modeling and visualizing future changes.
  • Healthcare and COVID-19 pandemic: Machine learning can help to solve diagnostic and prognostic problems in a variety of medical domains, such as disease prediction, medical knowledge extraction, detecting regularities in data, patient management, etc. [ 33 , 77 , 112 ]. Coronavirus disease (COVID-19) is an infectious disease caused by a newly discovered coronavirus, according to the World Health Organization (WHO) [ 3 ]. Recently, learning techniques have become popular in the battle against COVID-19 [ 61 , 63 ]. During the COVID-19 pandemic, learning techniques have been used to classify patients at high risk, estimate mortality rates, and detect other anomalies [ 61 ]. They can also be used to better understand the virus’s origin, predict COVID-19 outbreaks, and support disease diagnosis and treatment [ 14 , 50 ]. With the help of machine learning, researchers can forecast where and when COVID-19 is likely to spread and notify those regions so that the required arrangements can be made. Deep learning also provides exciting solutions to problems in medical image processing and is seen as a crucial technique for potential applications, particularly for the COVID-19 pandemic [ 10 , 78 , 111 ]. Overall, machine and deep learning techniques can help to fight the COVID-19 virus and the pandemic, as well as to support intelligent clinical decision-making in the healthcare domain.
  • E-commerce and product recommendations: Product recommendation is one of the most well known and widely used applications of machine learning, and it is one of the most prominent features of almost any e-commerce website today. Machine learning technology can assist businesses in analyzing their consumers’ purchasing histories and making customized product suggestions for their next purchase based on their behavior and preferences. E-commerce companies, for example, can easily position product suggestions and offers by analyzing browsing trends and click-through rates of specific items. Using predictive modeling based on machine learning techniques, many online retailers, such as Amazon [ 71 ], can better manage inventory, prevent out-of-stock situations, and optimize logistics and warehousing. The future of sales and marketing is the ability to capture, evaluate, and use consumer data to provide a customized shopping experience. Furthermore, machine learning techniques enable companies to create packages and content that are tailored to the needs of their customers, allowing them to maintain existing customers while attracting new ones.
  • NLP and sentiment analysis: Natural language processing (NLP) involves the reading and understanding of spoken or written language through the medium of a computer [ 79 , 103 ]. NLP thus helps computers, for instance, to read text, hear speech, interpret it, analyze sentiment, and decide which aspects are significant, and machine learning techniques can be used for these tasks. Virtual personal assistants, chatbots, speech recognition, document description, and language or machine translation are some examples of NLP-related tasks. Sentiment analysis [ 90 ] (also referred to as opinion mining or emotion AI) is an NLP sub-field that seeks to identify and extract public mood and views within a given text, through blogs, reviews, social media, forums, news, etc. For instance, businesses and brands use sentiment analysis to understand the social sentiment toward their brand, product, or service through social media platforms or the web as a whole. Overall, sentiment analysis is considered a machine learning task that analyzes texts for polarity, such as “positive”, “negative”, or “neutral”, along with more intense emotions such as very happy, happy, sad, very sad, angry, interested, or not interested (see the sketch after this list).
  • Image, speech and pattern recognition: Image recognition [ 36 ] is a well-known and widespread example of machine learning in the real world; it identifies objects in digital images. For instance, labeling an X-ray as cancerous or not, character recognition, face detection in an image, and tagging suggestions on social media (e.g., Facebook) are common examples of image recognition. Speech recognition [ 23 ] is also very popular; it typically uses sound and linguistic models, as in Google Assistant, Cortana, Siri, Alexa, etc. [ 67 ], where machine learning methods are used. Pattern recognition [ 13 ] is defined as the automated recognition of patterns and regularities in data, e.g., image analysis. Several machine learning techniques, such as classification, feature selection, clustering, and sequence-labeling methods, are used in the area.
  • Sustainable agriculture: Agriculture is essential to the survival of all human activities [ 109 ]. Sustainable agriculture practices help to improve agricultural productivity while also reducing negative impacts on the environment [ 5 , 25 , 109 ]. Sustainable agriculture supply chains are knowledge-intensive and based on information, skills, technologies, etc., where knowledge transfer encourages farmers to improve their decisions to adopt sustainable agriculture practices, utilizing the increasing amount of data captured by emerging technologies such as the Internet of Things (IoT) and mobile technologies and devices [ 5 , 53 , 54 ]. Machine learning can be applied in various phases of sustainable agriculture: in the pre-production phase, for the prediction of crop yield, soil properties, irrigation requirements, etc.; in the production phase, for weather prediction, disease detection, weed detection, soil nutrient management, livestock management, etc.; in the processing phase, for demand estimation, production planning, etc.; and in the distribution phase, for inventory management, consumer analysis, etc.
  • User behavior analytics and context-aware smartphone applications: Context-awareness is a system’s ability to capture knowledge about its surroundings at any moment and modify its behavior accordingly [ 28 , 93 ]. Context-aware computing uses software and hardware to automatically collect and interpret data for direct responses. The mobile app development environment has changed greatly with the power of AI, particularly machine learning techniques, through their ability to learn from contextual data [ 103 , 136 ]. Thus, developers of mobile apps can rely on machine learning to create smart apps that can understand human behavior and support and entertain users [ 107 , 137 , 140 ]. Machine learning techniques are applicable for building various personalized, data-driven, context-aware systems, such as smart interruption management, smart mobile recommendation, context-aware smart searching, and decision-making that intelligently assists mobile phone users in a pervasive computing environment. For example, context-aware association rules can be used to build an intelligent phone call application [ 104 ]. Clustering approaches are useful in capturing users’ diverse behavioral activities by taking into account time-series data [ 102 ]. To predict future events in various contexts, classification methods can be used [ 106 , 139 ]. Thus, the various learning techniques discussed in Sect. “ Machine Learning Tasks and Algorithms ” can help to build context-aware, adaptive, and smart applications according to the preferences of mobile phone users.
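Returning to the sentiment-analysis item above, here is a minimal scikit-learn sketch of polarity classification (the toy data, the TF-IDF features, and the logistic-regression model are illustrative assumptions, not methods prescribed by the survey):

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy labeled reviews: 1 = positive polarity, 0 = negative polarity.
texts = ["great product, very happy", "terrible service, very sad",
         "works well, love it", "broken on arrival, angry"]
labels = [1, 0, 1, 0]

# TF-IDF features plus logistic regression: a common polarity baseline.
clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(texts, labels)
print(clf.predict(["happy with this purchase"]))  # expected: [1]

In practice, such a baseline would be trained on a far larger labeled corpus, but the pipeline shape, text vectorization followed by a classifier, is the same.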

In addition to these application areas, machine learning-based models can also apply to several other domains such as bioinformatics, cheminformatics, computer networks, DNA sequence classification, economics and banking, robotics, advanced engineering, and many more.

Challenges and Research Directions

Our study on machine learning algorithms for intelligent data analysis and applications opens several research issues in the area. Thus, in this section, we summarize and discuss the challenges faced and the potential research opportunities and future directions.

In general, the effectiveness and efficiency of a machine learning-based solution depend on the nature and characteristics of the data and on the performance of the learning algorithms. Collecting data in a relevant domain, such as cybersecurity, IoT, healthcare, or agriculture, discussed in Sect. “ Applications of Machine Learning ”, is not straightforward, although the current cyberspace enables the production of a huge amount of data at very high frequency. Thus, collecting useful data for target machine learning-based applications, e.g., smart city applications, and managing that data are important for further analysis. Therefore, a more in-depth investigation of data collection methods is needed when working with real-world data. Moreover, historical data may contain many ambiguous values, missing values, outliers, and meaningless data. The machine learning algorithms discussed in Sect. “ Machine Learning Tasks and Algorithms ” depend highly on the quality and availability of data for training, and consequently so does the resultant model. Thus, accurately cleaning and pre-processing the diverse data collected from diverse sources is a challenging task. Therefore, effectively modifying or enhancing existing pre-processing methods, or proposing new data preparation techniques, is required to use the learning algorithms effectively in the associated application domain.

To analyze the data and extract insights, many machine learning algorithms exist, as summarized in Sect. “ Machine Learning Tasks and Algorithms ”. Thus, selecting a learning algorithm that is suitable for the target application is challenging, because the outcomes of different learning algorithms may vary depending on the data characteristics [ 106 ]. Selecting the wrong learning algorithm would produce unexpected outcomes, leading to a loss of effort as well as of the model’s effectiveness and accuracy. In terms of model building, the techniques discussed in Sect. “ Machine Learning Tasks and Algorithms ” can be used directly to solve many real-world issues in diverse domains, such as cybersecurity, smart cities, and healthcare, summarized in Sect. “ Applications of Machine Learning ”. However, hybrid learning models, e.g., ensembles of methods, the modification or enhancement of existing learning techniques, or the design of new learning methods could be potential future work in the area.

Thus, the ultimate success of a machine learning-based solution and its corresponding applications depends mainly on both the data and the learning algorithms. If the data are unsuitable for learning, e.g., non-representative, of poor quality, containing irrelevant features, or insufficient in quantity for training, then the machine learning models may become useless or produce lower accuracy. Therefore, effectively processing the data and handling the diverse learning algorithms are important for a machine learning-based solution and, eventually, for building intelligent applications.

Conclusion

In this paper, we have conducted a comprehensive overview of machine learning algorithms for intelligent data analysis and applications. According to our goal, we have briefly discussed how various types of machine learning methods can be used to build solutions to various real-world issues. A successful machine learning model depends on both the data and the performance of the learning algorithms. The sophisticated learning algorithms must be trained on collected real-world data and on knowledge related to the target application before the system can assist with intelligent decision-making. We also discussed several popular application areas based on machine learning techniques to highlight their applicability to various real-world issues. Finally, we summarized and discussed the challenges faced and the potential research opportunities and future directions in the area. The identified challenges create promising research opportunities in the field, which must be addressed with effective solutions in various application areas. Overall, we believe that our study on machine learning-based solutions opens a promising direction and can serve, from a technical point of view, as a reference guide for potential research and applications for academics, industry professionals, and decision-makers.

Declaration

The author declares no conflict of interest.

This article is part of the topical collection “Advances in Computational Approaches for Artificial Intelligence, Image Processing, IoT and Cloud Applications” guest edited by Bhanu Prakash K N and M. Shivakumar.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


JMLR Volume 24

Approximation Bounds for Hierarchical Clustering: Average Linkage, Bisecting K-means, and Local Search Benjamin Moseley, Joshua R. Wang ; (1):1−36, 2023. [ abs ][ pdf ][ bib ]

The Brier Score under Administrative Censoring: Problems and a Solution Håvard Kvamme, Ørnulf Borgan ; (2):1−26, 2023. [ abs ][ pdf ][ bib ]

Bayesian Spiked Laplacian Graphs Leo L Duan, George Michailidis, Mingzhou Ding ; (3):1−35, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Efficient Structure-preserving Support Tensor Train Machine Kirandeep Kour, Sergey Dolgov, Martin Stoll, Peter Benner ; (4):1−22, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Cluster-Specific Predictions with Multi-Task Gaussian Processes Arthur Leroy, Pierre Latouche, Benjamin Guedj, Servane Gey ; (5):1−49, 2023. [ abs ][ pdf ][ bib ]      [ code ]

AutoKeras: An AutoML Library for Deep Learning Haifeng Jin, François Chollet, Qingquan Song, Xia Hu ; (6):1−6, 2023. (Machine Learning Open Source Software Paper) [ abs ][ pdf ][ bib ]      [ code ]

On Distance and Kernel Measures of Conditional Dependence Tianhong Sheng, Bharath K. Sriperumbudur ; (7):1−16, 2023. [ abs ][ pdf ][ bib ]

A Relaxed Inertial Forward-Backward-Forward Algorithm for Solving Monotone Inclusions with Application to GANs Radu I. Bot, Michael Sedlmayer, Phan Tu Vuong ; (8):1−37, 2023. [ abs ][ pdf ][ bib ]

Sampling random graph homomorphisms and applications to network data analysis Hanbaek Lyu, Facundo Memoli, David Sivakoff ; (9):1−79, 2023. [ abs ][ pdf ][ bib ]      [ code ]

A Line-Search Descent Algorithm for Strict Saddle Functions with Complexity Guarantees Michael J. O'Neill, Stephen J. Wright ; (10):1−34, 2023. [ abs ][ pdf ][ bib ]

Optimal Strategies for Reject Option Classifiers Vojtech Franc, Daniel Prusa, Vaclav Voracek ; (11):1−49, 2023. [ abs ][ pdf ][ bib ]

Learning-augmented count-min sketches via Bayesian nonparametrics Emanuele Dolera, Stefano Favaro, Stefano Peluchetti ; (12):1−60, 2023. [ abs ][ pdf ][ bib ]

Adaptation to the Range in K-Armed Bandits Hédi Hadiji, Gilles Stoltz ; (13):1−33, 2023. [ abs ][ pdf ][ bib ]

Python package for causal discovery based on LiNGAM Takashi Ikeuchi, Mayumi Ide, Yan Zeng, Takashi Nicholas Maeda, Shohei Shimizu ; (14):1−8, 2023. (Machine Learning Open Source Software Paper) [ abs ][ pdf ][ bib ]      [ code ]

Extending Adversarial Attacks to Produce Adversarial Class Probability Distributions Jon Vadillo, Roberto Santana, Jose A. Lozano ; (15):1−42, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Globally-Consistent Rule-Based Summary-Explanations for Machine Learning Models: Application to Credit-Risk Evaluation Cynthia Rudin, Yaron Shaposhnik ; (16):1−44, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Learning Mean-Field Games with Discounted and Average Costs Berkay Anahtarci, Can Deha Kariksiz, Naci Saldi ; (17):1−59, 2023. [ abs ][ pdf ][ bib ]

An Inertial Block Majorization Minimization Framework for Nonsmooth Nonconvex Optimization Le Thi Khanh Hien, Duy Nhat Phan, Nicolas Gillis ; (18):1−41, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Regularized Joint Mixture Models Konstantinos Perrakis, Thomas Lartigue, Frank Dondelinger, Sach Mukherjee ; (19):1−47, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Interpolating Classifiers Make Few Mistakes Tengyuan Liang, Benjamin Recht ; (20):1−27, 2023. [ abs ][ pdf ][ bib ]

Graph-Aided Online Multi-Kernel Learning Pouya M. Ghari, Yanning Shen ; (21):1−44, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Lower Bounds and Accelerated Algorithms for Bilevel Optimization Kaiyi Ji, Yingbin Liang ; (22):1−56, 2023. [ abs ][ pdf ][ bib ]

Bayesian Data Selection Eli N. Weinstein, Jeffrey W. Miller ; (23):1−72, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Calibrated Multiple-Output Quantile Regression with Representation Learning Shai Feldman, Stephen Bates, Yaniv Romano ; (24):1−48, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Discrete Variational Calculus for Accelerated Optimization Cédric M. Campos, Alejandro Mahillo, David Martín de Diego ; (25):1−33, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Generalization Bounds for Noisy Iterative Algorithms Using Properties of Additive Noise Channels Hao Wang, Rui Gao, Flavio P. Calmon ; (26):1−43, 2023. [ abs ][ pdf ][ bib ]

The SKIM-FA Kernel: High-Dimensional Variable Selection and Nonlinear Interaction Discovery in Linear Time Raj Agrawal, Tamara Broderick ; (27):1−60, 2023. [ abs ][ pdf ][ bib ]

Impact of classification difficulty on the weight matrices spectra in Deep Learning and application to early-stopping Xuran Meng, Jeff Yao ; (28):1−40, 2023. (Machine Learning Open Source Software Paper) [ abs ][ pdf ][ bib ]      [ code ]

HiClass: a Python Library for Local Hierarchical Classification Compatible with Scikit-learn Fábio M. Miranda, Niklas Köhnecke, Bernhard Y. Renard ; (29):1−17, 2023. (Machine Learning Open Source Software Paper) [ abs ][ pdf ][ bib ]      [ code ]

Attacks against Federated Learning Defense Systems and their Mitigation Cody Lewis, Vijay Varadharajan, Nasimul Noman ; (30):1−50, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Labels, Information, and Computation: Efficient Learning Using Sufficient Labels Shiyu Duan, Spencer Chang, Jose C. Principe ; (31):1−35, 2023. [ abs ][ pdf ][ bib ]

Sparse PCA: a Geometric Approach Dimitris Bertsimas, Driss Lahlou Kitane ; (32):1−33, 2023. [ abs ][ pdf ][ bib ]

Gap Minimization for Knowledge Sharing and Transfer Boyu Wang, Jorge A. Mendez, Changjian Shui, Fan Zhou, Di Wu, Gezheng Xu, Christian Gagné, Eric Eaton ; (33):1−57, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Quantus: An Explainable AI Toolkit for Responsible Evaluation of Neural Network Explanations and Beyond Anna Hedström, Leander Weber, Daniel Krakowczyk, Dilyara Bareeva, Franz Motzkus, Wojciech Samek, Sebastian Lapuschkin, Marina M.-C. Höhne ; (34):1−11, 2023. (Machine Learning Open Source Software Paper) [ abs ][ pdf ][ bib ]      [ code ]

Can Reinforcement Learning Find Stackelberg-Nash Equilibria in General-Sum Markov Games with Myopically Rational Followers? Han Zhong, Zhuoran Yang, Zhaoran Wang, Michael I. Jordan ; (35):1−52, 2023. [ abs ][ pdf ][ bib ]

Label Distribution Changing Learning with Sample Space Expanding Chao Xu, Hong Tao, Jing Zhang, Dewen Hu, Chenping Hou ; (36):1−48, 2023. [ abs ][ pdf ][ bib ]

Ridges, Neural Networks, and the Radon Transform Michael Unser ; (37):1−33, 2023. [ abs ][ pdf ][ bib ]

First-Order Algorithms for Nonlinear Generalized Nash Equilibrium Problems Michael I. Jordan, Tianyi Lin, Manolis Zampetakis ; (38):1−46, 2023. [ abs ][ pdf ][ bib ]

Sensing Theorems for Unsupervised Learning in Linear Inverse Problems Julián Tachella, Dongdong Chen, Mike Davies ; (39):1−45, 2023. [ abs ][ pdf ][ bib ]

On Batch Teaching Without Collusion Shaun Fallat, David Kirkpatrick, Hans U. Simon, Abolghasem Soltani, Sandra Zilles ; (40):1−33, 2023. [ abs ][ pdf ][ bib ]

Neural Implicit Flow: a mesh-agnostic dimensionality reduction paradigm of spatio-temporal data Shaowu Pan, Steven L. Brunton, J. Nathan Kutz ; (41):1−60, 2023. [ abs ][ pdf ][ bib ]      [ code ]

A Simple Approach to Improve Single-Model Deep Uncertainty via Distance-Awareness Jeremiah Zhe Liu, Shreyas Padhy, Jie Ren, Zi Lin, Yeming Wen, Ghassen Jerfel, Zachary Nado, Jasper Snoek, Dustin Tran, Balaji Lakshminarayanan ; (42):1−63, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Benchmarking Graph Neural Networks Vijay Prakash Dwivedi, Chaitanya K. Joshi, Anh Tuan Luu, Thomas Laurent, Yoshua Bengio, Xavier Bresson ; (43):1−48, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Robust Load Balancing with Machine Learned Advice Sara Ahmadian, Hossein Esfandiari, Vahab Mirrokni, Binghui Peng ; (44):1−46, 2023. [ abs ][ pdf ][ bib ]

The multimarginal optimal transport formulation of adversarial multiclass classification Nicolás García Trillos, Matt Jacobs, Jakwang Kim ; (45):1−56, 2023. [ abs ][ pdf ][ bib ]

The d-Separation Criterion in Categorical Probability Tobias Fritz, Andreas Klingler ; (46):1−49, 2023. [ abs ][ pdf ][ bib ]

A Group-Theoretic Approach to Computational Abstraction: Symmetry-Driven Hierarchical Clustering Haizi Yu, Igor Mineyev, Lav R. Varshney ; (47):1−61, 2023. [ abs ][ pdf ][ bib ]

On the Convergence of Stochastic Gradient Descent with Bandwidth-based Step Size Xiaoyu Wang, Ya-xiang Yuan ; (48):1−49, 2023. [ abs ][ pdf ][ bib ]

Reinforcement Learning for Joint Optimization of Multiple Rewards Mridul Agarwal, Vaneet Aggarwal ; (49):1−41, 2023. [ abs ][ pdf ][ bib ]

Convergence Rates of a Class of Multivariate Density Estimation Methods Based on Adaptive Partitioning Linxi Liu, Dangna Li, Wing Hung Wong ; (50):1−64, 2023. [ abs ][ pdf ][ bib ]

Online Change-Point Detection in High-Dimensional Covariance Structure with Application to Dynamic Networks Lingjun Li, Jun Li ; (51):1−44, 2023. [ abs ][ pdf ][ bib ]

Necessary and Sufficient Conditions for Inverse Reinforcement Learning of Bayesian Stopping Time Problems Kunal Pattanayak, Vikram Krishnamurthy ; (52):1−64, 2023. [ abs ][ pdf ][ bib ]

VCG Mechanism Design with Unknown Agent Values under Stochastic Bandit Feedback Kirthevasan Kandasamy, Joseph E Gonzalez, Michael I Jordan, Ion Stoica ; (53):1−45, 2023. [ abs ][ pdf ][ bib ]

Contextual Stochastic Block Model: Sharp Thresholds and Contiguity Chen Lu, Subhabrata Sen ; (54):1−34, 2023. [ abs ][ pdf ][ bib ]

Kernel-based estimation for partially functional linear model: Minimax rates and randomized sketches Shaogao Lv, Xin He, Junhui Wang ; (55):1−38, 2023. [ abs ][ pdf ][ bib ]

On the geometry of Stein variational gradient descent Andrew Duncan, Nikolas Nüsken, Lukasz Szpruch ; (56):1−39, 2023. [ abs ][ pdf ][ bib ]

Tree-AMP: Compositional Inference with Tree Approximate Message Passing Antoine Baker, Florent Krzakala, Benjamin Aubin, Lenka Zdeborová ; (57):1−89, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Online Stochastic Gradient Descent with Arbitrary Initialization Solves Non-smooth, Non-convex Phase Retrieval Yan Shuo Tan, Roman Vershynin ; (58):1−47, 2023. [ abs ][ pdf ][ bib ]

Topological Convolutional Layers for Deep Learning Ephy R. Love, Benjamin Filippenko, Vasileios Maroulas, Gunnar Carlsson ; (59):1−35, 2023. [ abs ][ pdf ][ bib ]

Provably Sample-Efficient Model-Free Algorithm for MDPs with Peak Constraints Qinbo Bai, Vaneet Aggarwal, Ather Gattami ; (60):1−25, 2023. [ abs ][ pdf ][ bib ]

Density estimation on low-dimensional manifolds: an inflation-deflation approach Christian Horvat, Jean-Pascal Pfister ; (61):1−37, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Monotonic Alpha-divergence Minimisation for Variational Inference Kamélia Daudel, Randal Douc, François Roueff ; (62):1−76, 2023. [ abs ][ pdf ][ bib ]

On the Complexity of SHAP-Score-Based Explanations: Tractability via Knowledge Compilation and Non-Approximability Results Marcelo Arenas, Pablo Barcelo, Leopoldo Bertossi, Mikael Monet ; (63):1−58, 2023. [ abs ][ pdf ][ bib ]

Fundamental limits and algorithms for sparse linear regression with sublinear sparsity Lan V. Truong ; (64):1−49, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Wide-minima Density Hypothesis and the Explore-Exploit Learning Rate Schedule Nikhil Iyer, V. Thejas, Nipun Kwatra, Ramachandran Ramjee, Muthian Sivathanu ; (65):1−37, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Posterior Contraction for Deep Gaussian Process Priors Gianluca Finocchio, Johannes Schmidt-Hieber ; (66):1−49, 2023. [ abs ][ pdf ][ bib ]

Prior Specification for Bayesian Matrix Factorization via Prior Predictive Matching Eliezer de Souza da Silva, Tomasz Kuśmierczyk, Marcelo Hartmann, Arto Klami ; (67):1−51, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Distributed Nonparametric Regression Imputation for Missing Response Problems with Large-scale Data Ruoyu Wang, Miaomiao Su, Qihua Wang ; (68):1−52, 2023. [ abs ][ pdf ][ bib ]      [ code ]

When Locally Linear Embedding Hits Boundary Hau-Tieng Wu, Nan Wu ; (69):1−80, 2023. [ abs ][ pdf ][ bib ]

Optimizing ROC Curves with a Sort-Based Surrogate Loss for Binary Classification and Changepoint Detection Jonathan Hillman, Toby Dylan Hocking ; (70):1−24, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Kernel-Matrix Determinant Estimates from stopped Cholesky Decomposition Simon Bartels, Wouter Boomsma, Jes Frellsen, Damien Garreau ; (71):1−57, 2023. [ abs ][ pdf ][ bib ]      [ code ]

How Do You Want Your Greedy: Simultaneous or Repeated? Moran Feldman, Christopher Harshaw, Amin Karbasi ; (72):1−87, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Inference for a Large Directed Acyclic Graph with Unspecified Interventions Chunlin Li, Xiaotong Shen, Wei Pan ; (73):1−48, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Privacy-Aware Rejection Sampling Jordan Awan, Vinayak Rao ; (74):1−32, 2023. [ abs ][ pdf ][ bib ]

Intrinsic Persistent Homology via Density-based Metric Learning Ximena Fernández, Eugenio Borghini, Gabriel Mindlin, Pablo Groisman ; (75):1−42, 2023. [ abs ][ pdf ][ bib ]      [ code ]

A Randomized Subspace-based Approach for Dimensionality Reduction and Important Variable Selection Di Bo, Hoon Hwangbo, Vinit Sharma, Corey Arndt, Stephanie TerMaath ; (76):1−31, 2023. [ abs ][ pdf ][ bib ]

A Likelihood Approach to Nonparametric Estimation of a Singular Distribution Using Deep Generative Models Minwoo Chae, Dongha Kim, Yongdai Kim, Lizhen Lin ; (77):1−42, 2023. [ abs ][ pdf ][ bib ]

Towards Learning to Imitate from a Single Video Demonstration Glen Berseth, Florian Golemo, Christopher Pal ; (78):1−26, 2023. [ abs ][ pdf ][ bib ]

Approximate Post-Selective Inference for Regression with the Group LASSO Snigdha Panigrahi, Peter W MacDonald, Daniel Kessler ; (79):1−49, 2023. [ abs ][ pdf ][ bib ]

Temporal Abstraction in Reinforcement Learning with the Successor Representation Marlos C. Machado, Andre Barreto, Doina Precup, Michael Bowling ; (80):1−69, 2023. [ abs ][ pdf ][ bib ]

Fast Online Changepoint Detection via Functional Pruning CUSUM Statistics Gaetano Romano, Idris A. Eckley, Paul Fearnhead, Guillem Rigaill ; (81):1−36, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Iterated Block Particle Filter for High-dimensional Parameter Learning: Beating the Curse of Dimensionality Ning Ning, Edward L. Ionides ; (82):1−76, 2023. [ abs ][ pdf ][ bib ]

Bayes-Newton Methods for Approximate Bayesian Inference with PSD Guarantees William J. Wilkinson, Simo Särkkä, Arno Solin ; (83):1−50, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Online Optimization over Riemannian Manifolds Xi Wang, Zhipeng Tu, Yiguang Hong, Yingyi Wu, Guodong Shi ; (84):1−67, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Doubly Robust Stein-Kernelized Monte Carlo Estimator: Simultaneous Bias-Variance Reduction and Supercanonical Convergence Henry Lam, Haofeng Zhang ; (85):1−58, 2023. [ abs ][ pdf ][ bib ]

Learning Partial Differential Equations in Reproducing Kernel Hilbert Spaces George Stepaniants ; (86):1−72, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Gaussian Processes with Errors in Variables: Theory and Computation Shuang Zhou, Debdeep Pati, Tianying Wang, Yun Yang, Raymond J. Carroll ; (87):1−53, 2023. [ abs ][ pdf ][ bib ]

Dimension-Grouped Mixed Membership Models for Multivariate Categorical Data Yuqi Gu, Elena E. Erosheva, Gongjun Xu, David B. Dunson ; (88):1−49, 2023. [ abs ][ pdf ][ bib ]

Neural Operator: Learning Maps Between Function Spaces With Applications to PDEs Nikola Kovachki, Zongyi Li, Burigede Liu, Kamyar Azizzadenesheli, Kaushik Bhattacharya, Andrew Stuart, Anima Anandkumar ; (89):1−97, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Outlier-Robust Subsampling Techniques for Persistent Homology Bernadette J. Stolz ; (90):1−35, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Recursive Quantile Estimation: Non-Asymptotic Confidence Bounds Likai Chen, Georg Keilbar, Wei Biao Wu ; (91):1−25, 2023. [ abs ][ pdf ][ bib ]

Non-Asymptotic Guarantees for Robust Statistical Learning under Infinite Variance Assumption Lihu Xu, Fang Yao, Qiuran Yao, Huiming Zhang ; (92):1−46, 2023. [ abs ][ pdf ][ bib ]

Decentralized Learning: Theoretical Optimality and Practical Improvements Yucheng Lu, Christopher De Sa ; (93):1−62, 2023. [ abs ][ pdf ][ bib ]

Faith-Shap: The Faithful Shapley Interaction Index Che-Ping Tsai, Chih-Kuan Yeh, Pradeep Ravikumar ; (94):1−42, 2023. [ abs ][ pdf ][ bib ]

Statistical Inference for Noisy Incomplete Binary Matrix Yunxiao Chen, Chengcheng Li, Jing Ouyang, Gongjun Xu ; (95):1−66, 2023. [ abs ][ pdf ][ bib ]

Global Convergence of Sub-gradient Method for Robust Matrix Recovery: Small Initialization, Noisy Measurements, and Over-parameterization Jianhao Ma, Salar Fattahi ; (96):1−84, 2023. [ abs ][ pdf ][ bib ]

Fitting Autoregressive Graph Generative Models through Maximum Likelihood Estimation Xu Han, Xiaohui Chen, Francisco J. R. Ruiz, Li-Ping Liu ; (97):1−30, 2023. [ abs ][ pdf ][ bib ]      [ code ]

An Analysis of Robustness of Non-Lipschitz Networks Maria-Florina Balcan, Avrim Blum, Dravyansh Sharma, Hongyang Zhang ; (98):1−43, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Connectivity Matters: Neural Network Pruning Through the Lens of Effective Sparsity Artem Vysogorets, Julia Kempe ; (99):1−23, 2023. [ abs ][ pdf ][ bib ]

FedLab: A Flexible Federated Learning Framework Dun Zeng, Siqi Liang, Xiangjing Hu, Hui Wang, Zenglin Xu ; (100):1−7, 2023. (Machine Learning Open Source Software Paper) [ abs ][ pdf ][ bib ]      [ code ]

Inference for Gaussian Processes with Matern Covariogram on Compact Riemannian Manifolds Didong Li, Wenpin Tang, Sudipto Banerjee ; (101):1−26, 2023. [ abs ][ pdf ][ bib ]

Learning Optimal Group-structured Individualized Treatment Rules with Many Treatments Haixu Ma, Donglin Zeng, Yufeng Liu ; (102):1−48, 2023. [ abs ][ pdf ][ bib ]

Sparse Training with Lipschitz Continuous Loss Functions and a Weighted Group L0-norm Constraint Michael R. Metel ; (103):1−44, 2023. [ abs ][ pdf ][ bib ]

Intrinsic Gaussian Process on Unknown Manifolds with Probabilistic Metrics Mu Niu, Zhenwen Dai, Pokman Cheung, Yizhu Wang ; (104):1−42, 2023. [ abs ][ pdf ][ bib ]

Knowledge Hypergraph Embedding Meets Relational Algebra Bahare Fatemi, Perouz Taslakian, David Vazquez, David Poole ; (105):1−34, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Concentration analysis of multivariate elliptic diffusions Lukas Trottner, Cathrine Aeckerle-Willems, Claudia Strauch ; (106):1−38, 2023. [ abs ][ pdf ][ bib ]

Risk Bounds for Positive-Unlabeled Learning Under the Selected At Random Assumption Olivier Coudray, Christine Keribin, Pascal Massart, Patrick Pamphile ; (107):1−31, 2023. [ abs ][ pdf ][ bib ]

Bayesian Calibration of Imperfect Computer Models using Physics-Informed Priors Michail Spitieris, Ingelin Steinsland ; (108):1−39, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Dimensionless machine learning: Imposing exact units equivariance Soledad Villar, Weichi Yao, David W. Hogg, Ben Blum-Smith, Bianca Dumitrascu ; (109):1−32, 2023. [ abs ][ pdf ][ bib ]

A General Theory for Federated Optimization with Asynchronous and Heterogeneous Clients Updates Yann Fraboni, Richard Vidal, Laetitia Kameni, Marco Lorenzi ; (110):1−43, 2023. [ abs ][ pdf ][ bib ]      [ code ]

FLIP: A Utility Preserving Privacy Mechanism for Time Series Tucker McElroy, Anindya Roy, Gaurab Hore ; (111):1−29, 2023. [ abs ][ pdf ][ bib ]

The Hyperspherical Geometry of Community Detection: Modularity as a Distance Martijn Gösgens, Remco van der Hofstad, Nelly Litvak ; (112):1−36, 2023. [ abs ][ pdf ][ bib ]      [ code ]

The Implicit Bias of Benign Overfitting Ohad Shamir ; (113):1−40, 2023. [ abs ][ pdf ][ bib ]

Generalization Bounds for Adversarial Contrastive Learning Xin Zou, Weiwei Liu ; (114):1−54, 2023. [ abs ][ pdf ][ bib ]

Learning Good State and Action Representations for Markov Decision Process via Tensor Decomposition Chengzhuo Ni, Yaqi Duan, Munther Dahleh, Mengdi Wang, Anru R. Zhang ; (115):1−53, 2023. [ abs ][ pdf ][ bib ]

SQLFlow: An Extensible Toolkit Integrating DB and AI Jun Zhou, Ke Zhang, Lin Wang, Hua Wu, Yi Wang, ChaoChao Chen ; (116):1−9, 2023. (Machine Learning Open Source Software Paper) [ abs ][ pdf ][ bib ]      [ code ]

Deep linear networks can benignly overfit when shallow ones do Niladri S. Chatterji, Philip M. Long ; (117):1−39, 2023. [ abs ][ pdf ][ bib ]      [ code ]

A Unified Framework for Optimization-Based Graph Coarsening Manoj Kumar, Anurag Sharma, Sandeep Kumar ; (118):1−50, 2023. [ abs ][ pdf ][ bib ]      [ code ]

An Annotated Graph Model with Differential Degree Heterogeneity for Directed Networks Stefan Stein, Chenlei Leng ; (119):1−69, 2023. [ abs ][ pdf ][ bib ]

Maximum likelihood estimation in Gaussian process regression is ill-posed Toni Karvonen, Chris J. Oates ; (120):1−47, 2023. [ abs ][ pdf ][ bib ]

Minimal Width for Universal Property of Deep RNN Chang hoon Song, Geonho Hwang, Jun ho Lee, Myungjoo Kang ; (121):1−41, 2023. [ abs ][ pdf ][ bib ]

Compute-Efficient Deep Learning: Algorithmic Trends and Opportunities Brian R. Bartoldson, Bhavya Kailkhura, Davis Blalock ; (122):1−77, 2023. [ abs ][ pdf ][ bib ]

Benign overfitting in ridge regression Alexander Tsigler, Peter L. Bartlett ; (123):1−76, 2023. [ abs ][ pdf ][ bib ]

HiGrad: Uncertainty Quantification for Online Learning and Stochastic Approximation Weijie J. Su, Yuancheng Zhu ; (124):1−53, 2023. [ abs ][ pdf ][ bib ]

Statistical Robustness of Empirical Risks in Machine Learning Shaoyan Guo, Huifu Xu, Liwei Zhang ; (125):1−38, 2023. [ abs ][ pdf ][ bib ]

Euler-Lagrange Analysis of Generative Adversarial Networks Siddarth Asokan, Chandra Sekhar Seelamantula ; (126):1−100, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Graph Clustering with Graph Neural Networks Anton Tsitsulin, John Palowitch, Bryan Perozzi, Emmanuel Müller ; (127):1−21, 2023. [ abs ][ pdf ][ bib ]      [ code ]

An Eigenmodel for Dynamic Multilayer Networks Joshua Daniel Loyal, Yuguo Chen ; (128):1−69, 2023. [ abs ][ pdf ][ bib ]      [ code ]

A First Look into the Carbon Footprint of Federated Learning Xinchi Qiu, Titouan Parcollet, Javier Fernandez-Marques, Pedro P. B. Gusmao, Yan Gao, Daniel J. Beutel, Taner Topal, Akhil Mathur, Nicholas D. Lane ; (129):1−23, 2023. [ abs ][ pdf ][ bib ]

Combinatorial Optimization and Reasoning with Graph Neural Networks Quentin Cappart, Didier Chételat, Elias B. Khalil, Andrea Lodi, Christopher Morris, Petar Veličković ; (130):1−61, 2023. [ abs ][ pdf ][ bib ]

A Rigorous Information-Theoretic Definition of Redundancy and Relevancy in Feature Selection Based on (Partial) Information Decomposition Patricia Wollstadt, Sebastian Schmitt, Michael Wibral ; (131):1−44, 2023. [ abs ][ pdf ][ bib ]

Generalized Linear Models in Non-interactive Local Differential Privacy with Public Data Di Wang, Lijie Hu, Huanyu Zhang, Marco Gaboardi, Jinhui Xu ; (132):1−57, 2023. [ abs ][ pdf ][ bib ]

Exploiting Discovered Regression Discontinuities to Debias Conditioned-on-observable Estimators Benjamin Jakubowski, Sriram Somanchi, Edward McFowland III, Daniel B. Neill ; (133):1−57, 2023. [ abs ][ pdf ][ bib ]      [ code ]

MARS: A Second-Order Reduction Algorithm for High-Dimensional Sparse Precision Matrices Estimation Qian Li, Binyan Jiang, Defeng Sun ; (134):1−44, 2023. [ abs ][ pdf ][ bib ]

Sparse GCA and Thresholded Gradient Descent Sheng Gao, Zongming Ma ; (135):1−61, 2023. [ abs ][ pdf ][ bib ]

Dimension Reduction in Contextual Online Learning via Nonparametric Variable Selection Wenhao Li, Ningyuan Chen, L. Jeff Hong ; (136):1−84, 2023. [ abs ][ pdf ][ bib ]

Implicit Bias of Gradient Descent for Mean Squared Error Regression with Two-Layer Wide Neural Networks Hui Jin, Guido Montufar ; (137):1−97, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Asymptotics of Network Embeddings Learned via Subsampling Andrew Davison, Morgane Austern ; (138):1−120, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Policy Gradient Methods Find the Nash Equilibrium in N-player General-sum Linear-quadratic Games Ben Hambly, Renyuan Xu, Huining Yang ; (139):1−56, 2023. [ abs ][ pdf ][ bib ]

Jump Interval-Learning for Individualized Decision Making with Continuous Treatments Hengrui Cai, Chengchun Shi, Rui Song, Wenbin Lu ; (140):1−92, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Optimal Convergence Rates for Distributed Nystroem Approximation Jian Li, Yong Liu, Weiping Wang ; (141):1−39, 2023. [ abs ][ pdf ][ bib ]      [ code ]

On Tilted Losses in Machine Learning: Theory and Applications Tian Li, Ahmad Beirami, Maziar Sanjabi, Virginia Smith ; (142):1−79, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Large sample spectral analysis of graph-based multi-manifold clustering Nicolas Garcia Trillos, Pengfei He, Chenghui Li ; (143):1−71, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Escaping The Curse of Dimensionality in Bayesian Model-Based Clustering Noirrit Kiran Chandra, Antonio Canale, David B. Dunson ; (144):1−42, 2023. [ abs ][ pdf ][ bib ]

Memory-Based Optimization Methods for Model-Agnostic Meta-Learning and Personalized Federated Learning Bokun Wang, Zhuoning Yuan, Yiming Ying, Tianbao Yang ; (145):1−46, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Off-Policy Actor-Critic with Emphatic Weightings Eric Graves, Ehsan Imani, Raksha Kumaraswamy, Martha White ; (146):1−63, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Stochastic Optimization under Distributional Drift Joshua Cutler, Dmitriy Drusvyatskiy, Zaid Harchaoui ; (147):1−56, 2023. [ abs ][ pdf ][ bib ]

Fast Objective & Duality Gap Convergence for Non-Convex Strongly-Concave Min-Max Problems with PL Condition Zhishuai Guo, Yan Yan, Zhuoning Yuan, Tianbao Yang ; (148):1−63, 2023. [ abs ][ pdf ][ bib ]

Controlling Wasserstein Distances by Kernel Norms with Application to Compressive Statistical Learning Titouan Vayer, Rémi Gribonval ; (149):1−51, 2023. [ abs ][ pdf ][ bib ]

MALib: A Parallel Framework for Population-based Multi-agent Reinforcement Learning Ming Zhou, Ziyu Wan, Hanjing Wang, Muning Wen, Runzhe Wu, Ying Wen, Yaodong Yang, Yong Yu, Jun Wang, Weinan Zhang ; (150):1−12, 2023. (Machine Learning Open Source Software Paper) [ abs ][ pdf ][ bib ]      [ code ]

Generalization error bounds for multiclass sparse linear classifiers Tomer Levy, Felix Abramovich ; (151):1−35, 2023. [ abs ][ pdf ][ bib ]

Selective inference for k-means clustering Yiqun T. Chen, Daniela M. Witten ; (152):1−41, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Consistent Model-based Clustering using the Quasi-Bernoulli Stick-breaking Process Cheng Zeng, Jeffrey W Miller, Leo L Duan ; (153):1−32, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Adapting and Evaluating Influence-Estimation Methods for Gradient-Boosted Decision Trees Jonathan Brophy, Zayd Hammoudeh, Daniel Lowd ; (154):1−48, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Adaptive Data Depth via Multi-Armed Bandits Tavor Baharav, Tze Leung Lai ; (155):1−29, 2023. [ abs ][ pdf ][ bib ]

Integrating Random Effects in Deep Neural Networks Giora Simchoni, Saharon Rosset ; (156):1−57, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Restarted Nonconvex Accelerated Gradient Descent: No More Polylogarithmic Factor in the O(epsilon^(-7/4)) Complexity Huan Li, Zhouchen Lin ; (157):1−37, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Asynchronous Iterations in Optimization: New Sequence Results and Sharper Algorithmic Guarantees Hamid Reza Feyzmahdavian, Mikael Johansson ; (158):1−75, 2023. [ abs ][ pdf ][ bib ]

Infinite-dimensional optimization and Bayesian nonparametric learning of stochastic differential equations Arnab Ganguly, Riten Mitra, Jinpu Zhou ; (159):1−39, 2023. [ abs ][ pdf ][ bib ]

Multivariate Soft Rank via Entropy-Regularized Optimal Transport: Sample Efficiency and Generative Modeling Shoaib Bin Masud, Matthew Werenski, James M. Murphy, Shuchin Aeron ; (160):1−65, 2023. [ abs ][ pdf ][ bib ]      [ code ]

q-Learning in Continuous Time Yanwei Jia, Xun Yu Zhou ; (161):1−61, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Flexible Model Aggregation for Quantile Regression Rasool Fakoor, Taesup Kim, Jonas Mueller, Alexander J. Smola, Ryan J. Tibshirani ; (162):1−45, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Preconditioned Gradient Descent for Overparameterized Nonconvex Burer--Monteiro Factorization with Global Optimality Certification Gavin Zhang, Salar Fattahi, Richard Y. Zhang ; (163):1−55, 2023. [ abs ][ pdf ][ bib ]

A Framework and Benchmark for Deep Batch Active Learning for Regression David Holzmüller, Viktor Zaverkin, Johannes Kästner, Ingo Steinwart ; (164):1−81, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Robust Methods for High-Dimensional Linear Learning Ibrahim Merad, Stéphane Gaïffas ; (165):1−44, 2023. [ abs ][ pdf ][ bib ]

A Parameter-Free Conditional Gradient Method for Composite Minimization under Hölder Condition Masaru Ito, Zhaosong Lu, Chuan He ; (166):1−34, 2023. [ abs ][ pdf ][ bib ]

Bilevel Optimization with a Lower-level Contraction: Optimal Sample Complexity without Warm-Start Riccardo Grazzi, Massimiliano Pontil, Saverio Salzo ; (167):1−37, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Inference on the Change Point under a High Dimensional Covariance Shift Abhishek Kaul, Hongjin Zhang, Konstantinos Tsampourakis, George Michailidis ; (168):1−68, 2023. [ abs ][ pdf ][ bib ]

DART: Distance Assisted Recursive Testing Xuechan Li, Anthony D. Sung, Jichun Xie ; (169):1−41, 2023. [ abs ][ pdf ][ bib ]

Small Transformers Compute Universal Metric Embeddings Anastasis Kratsios, Valentin Debarnot, Ivan Dokmanić ; (170):1−48, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Incremental Learning in Diagonal Linear Networks Raphaël Berthier ; (171):1−26, 2023. [ abs ][ pdf ][ bib ]

Beyond the Golden Ratio for Variational Inequality Algorithms Ahmet Alacaoglu, Axel Böhm, Yura Malitsky ; (172):1−33, 2023. [ abs ][ pdf ][ bib ]      [ code ]

From Classification Accuracy to Proper Scoring Rules: Elicitability of Probabilistic Top List Predictions Johannes Resin ; (173):1−21, 2023. [ abs ][ pdf ][ bib ]

Posterior Consistency for Bayesian Relevance Vector Machines Xiao Fang, Malay Ghosh ; (174):1−17, 2023. [ abs ][ pdf ][ bib ]

Model-Based Multi-Agent RL in Zero-Sum Markov Games with Near-Optimal Sample Complexity Kaiqing Zhang, Sham M. Kakade, Tamer Basar, Lin F. Yang ; (175):1−53, 2023. [ abs ][ pdf ][ bib ]

Evaluating Instrument Validity using the Principle of Independent Mechanisms Patrick F. Burauel ; (176):1−56, 2023. [ abs ][ pdf ][ bib ]

Comprehensive Algorithm Portfolio Evaluation using Item Response Theory Sevvandi Kandanaarachchi, Kate Smith-Miles ; (177):1−52, 2023. [ abs ][ pdf ][ bib ]      [ code ]

F2A2: Flexible Fully-decentralized Approximate Actor-critic for Cooperative Multi-agent Reinforcement Learning Wenhao Li, Bo Jin, Xiangfeng Wang, Junchi Yan, Hongyuan Zha ; (178):1−75, 2023. [ abs ][ pdf ][ bib ]

Variational Inference for Deblending Crowded Starfields Runjing Liu, Jon D. McAuliffe, Jeffrey Regier, The LSST Dark Energy Science Collaboration ; (179):1−36, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Dropout Training is Distributionally Robust Optimal José Blanchet, Yang Kang, José Luis Montiel Olea, Viet Anh Nguyen, Xuhui Zhang ; (180):1−60, 2023. [ abs ][ pdf ][ bib ]

Factor Graph Neural Networks Zhen Zhang, Mohammed Haroon Dupty, Fan Wu, Javen Qinfeng Shi, Wee Sun Lee ; (181):1−54, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Naive regression requires weaker assumptions than factor models to adjust for multiple cause confounding Justin Grimmer, Dean Knox, Brandon Stewart ; (182):1−70, 2023. [ abs ][ pdf ][ bib ]

Quasi-Equivalence between Width and Depth of Neural Networks Fenglei Fan, Rongjie Lai, Ge Wang ; (183):1−22, 2023. [ abs ][ pdf ][ bib ]

Metrizing Weak Convergence with Maximum Mean Discrepancies Carl-Johann Simon-Gabriel, Alessandro Barp, Bernhard Schölkopf, Lester Mackey ; (184):1−20, 2023. [ abs ][ pdf ][ bib ]

On the Theoretical Equivalence of Several Trade-Off Curves Assessing Statistical Proximity Rodrigue Siry, Ryan Webster, Loic Simon, Julien Rabin ; (185):1−34, 2023. [ abs ][ pdf ][ bib ]

Learning an Explicit Hyper-parameter Prediction Function Conditioned on Tasks Jun Shu, Deyu Meng, Zongben Xu ; (186):1−74, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Quantifying Network Similarity using Graph Cumulants Gecia Bravo-Hermsdorff, Lee M. Gunderson, Pierre-André Maugis, Carey E. Priebe ; (187):1−27, 2023. [ abs ][ pdf ][ bib ]      [ code ]

The Proximal ID Algorithm Ilya Shpitser, Zach Wood-Doughty, Eric J. Tchetgen Tchetgen ; (188):1−46, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Random Feature Neural Networks Learn Black-Scholes Type PDEs Without Curse of Dimensionality Lukas Gonon ; (189):1−51, 2023. [ abs ][ pdf ][ bib ]

Clustering with Tangles: Algorithmic Framework and Theoretical Guarantees Solveig Klepper, Christian Elbracht, Diego Fioravanti, Jakob Kneip, Luca Rendsburg, Maximilian Teegen, Ulrike von Luxburg ; (190):1−56, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Insights into Ordinal Embedding Algorithms: A Systematic Evaluation Leena Chennuru Vankadara, Michael Lohaus, Siavash Haghiri, Faiz Ul Wahab, Ulrike von Luxburg ; (191):1−83, 2023. [ abs ][ pdf ][ bib ]      [ code ]

PAC-learning for Strategic Classification Ravi Sundaram, Anil Vullikanti, Haifeng Xu, Fan Yao ; (192):1−38, 2023. [ abs ][ pdf ][ bib ]

Divide-and-Conquer Fusion Ryan S.Y. Chan, Murray Pollock, Adam M. Johansen, Gareth O. Roberts ; (193):1−82, 2023. [ abs ][ pdf ][ bib ]

MMD Aggregated Two-Sample Test Antonin Schrab, Ilmun Kim, Mélisande Albert, Béatrice Laurent, Benjamin Guedj, Arthur Gretton ; (194):1−81, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Clustering and Structural Robustness in Causal Diagrams Santtu Tikka, Jouni Helske, Juha Karvanen ; (195):1−32, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Variational Gibbs Inference for Statistical Model Estimation from Incomplete Data Vaidotas Simkus, Benjamin Rhodes, Michael U. Gutmann ; (196):1−72, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Contrasting Identifying Assumptions of Average Causal Effects: Robustness and Semiparametric Efficiency Tetiana Gorbach, Xavier de Luna, Juha Karvanen, Ingeborg Waernbaum ; (197):1−65, 2023. [ abs ][ pdf ][ bib ]      [ code ]

CodaLab Competitions: An Open Source Platform to Organize Scientific Challenges Adrien Pavao, Isabelle Guyon, Anne-Catherine Letournel, Dinh-Tuan Tran, Xavier Baro, Hugo Jair Escalante, Sergio Escalera, Tyler Thomas, Zhen Xu ; (198):1−6, 2023. (Machine Learning Open Source Software Paper) [ abs ][ pdf ][ bib ]      [ code ]

Q-Learning for MDPs with General Spaces: Convergence and Near Optimality via Quantization under Weak Continuity Ali Kara, Naci Saldi, Serdar Yüksel ; (199):1−34, 2023. [ abs ][ pdf ][ bib ]

Model-based Causal Discovery for Zero-Inflated Count Data Junsouk Choi, Yang Ni ; (200):1−32, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Variational Inverting Network for Statistical Inverse Problems of Partial Differential Equations Junxiong Jia, Yanni Wu, Peijun Li, Deyu Meng ; (201):1−60, 2023. [ abs ][ pdf ][ bib ]

Multiplayer Performative Prediction: Learning in Decision-Dependent Games Adhyyan Narang, Evan Faulkner, Dmitriy Drusvyatskiy, Maryam Fazel, Lillian J. Ratliff ; (202):1−56, 2023. [ abs ][ pdf ][ bib ]      [ code ]

A Non-parametric View of FedAvg and FedProx: Beyond Stationary Points Lili Su, Jiaming Xu, Pengkun Yang ; (203):1−48, 2023. [ abs ][ pdf ][ bib ]

Buffered Asynchronous SGD for Byzantine Learning Yi-Rui Yang, Wu-Jun Li ; (204):1−62, 2023. [ abs ][ pdf ][ bib ]

L0Learn: A Scalable Package for Sparse Learning using L0 Regularization Hussein Hazimeh, Rahul Mazumder, Tim Nonet ; (205):1−8, 2023. (Machine Learning Open Source Software Paper) [ abs ][ pdf ][ bib ]      [ code ]

Non-stationary Online Learning with Memory and Non-stochastic Control Peng Zhao, Yu-Hu Yan, Yu-Xiang Wang, Zhi-Hua Zhou ; (206):1−70, 2023. [ abs ][ pdf ][ bib ]

Augmented Sparsifiers for Generalized Hypergraph Cuts Nate Veldt, Austin R. Benson, Jon Kleinberg ; (207):1−50, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Minimax Risk Classifiers with 0-1 Loss Santiago Mazuelas, Mauricio Romero, Peter Grunwald ; (208):1−48, 2023. [ abs ][ pdf ][ bib ]

LibMTL: A Python Library for Deep Multi-Task Learning Baijiong Lin, Yu Zhang ; (209):1−7, 2023. (Machine Learning Open Source Software Paper) [ abs ][ pdf ][ bib ]      [ code ]

GFlowNet Foundations Yoshua Bengio, Salem Lahlou, Tristan Deleu, Edward J. Hu, Mo Tiwari, Emmanuel Bengio ; (210):1−55, 2023. [ abs ][ pdf ][ bib ]

Entropic Fictitious Play for Mean Field Optimization Problem Fan Chen, Zhenjie Ren, Songbo Wang ; (211):1−36, 2023. [ abs ][ pdf ][ bib ]

An Inexact Augmented Lagrangian Algorithm for Training Leaky ReLU Neural Network with Group Sparsity Wei Liu, Xin Liu, Xiaojun Chen ; (212):1−43, 2023. [ abs ][ pdf ][ bib ]

Polynomial-Time Algorithms for Counting and Sampling Markov Equivalent DAGs with Applications Marcel Wienöbst, Max Bannach, Maciej Liśkiewicz ; (213):1−45, 2023. [ abs ][ pdf ][ bib ]      [ code ]

An Empirical Investigation of the Role of Pre-training in Lifelong Learning Sanket Vaibhav Mehta, Darshan Patil, Sarath Chandar, Emma Strubell ; (214):1−50, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Least Squares Model Averaging for Distributed Data Haili Zhang, Zhaobo Liu, Guohua Zou ; (215):1−59, 2023. [ abs ][ pdf ][ bib ]

Random Forests for Change Point Detection Malte Londschien, Peter Bühlmann, Solt Kovács ; (216):1−45, 2023. [ abs ][ pdf ][ bib ]      [ code ]

GANs as Gradient Flows that Converge Yu-Jui Huang, Yuchong Zhang ; (217):1−40, 2023. [ abs ][ pdf ][ bib ]

Adaptation Augmented Model-based Policy Optimization Jian Shen, Hang Lai, Minghuan Liu, Han Zhao, Yong Yu, Weinan Zhang ; (218):1−35, 2023. [ abs ][ pdf ][ bib ]

Functional L-Optimality Subsampling for Functional Generalized Linear Models with Massive Data Hua Liu, Jinhong You, Jiguo Cao ; (219):1−41, 2023. [ abs ][ pdf ][ bib ]      [ code ]

A Unified Framework for Factorizing Distributional Value Functions for Multi-Agent Reinforcement Learning Wei-Fang Sun, Cheng-Kuang Lee, Simon See, Chun-Yi Lee ; (220):1−32, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Multi-source Learning via Completion of Block-wise Overlapping Noisy Matrices Doudou Zhou, Tianxi Cai, Junwei Lu ; (221):1−43, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Single Timescale Actor-Critic Method to Solve the Linear Quadratic Regulator with Convergence Guarantees Mo Zhou, Jianfeng Lu ; (222):1−34, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Conditional Distribution Function Estimation Using Neural Networks for Censored and Uncensored Data Bingqing Hu, Bin Nan ; (223):1−26, 2023. [ abs ][ pdf ][ bib ]      [ code ]

RankSEG: A Consistent Ranking-based Framework for Segmentation Ben Dai, Chunlin Li ; (224):1−50, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Limits of Dense Simplicial Complexes T. Mitchell Roddenberry, Santiago Segarra ; (225):1−42, 2023. [ abs ][ pdf ][ bib ]

Merlion: End-to-End Machine Learning for Time Series Aadyot Bhatnagar, Paul Kassianik, Chenghao Liu, Tian Lan, Wenzhuo Yang, Rowan Cassius, Doyen Sahoo, Devansh Arpit, Sri Subramanian, Gerald Woo, Amrita Saha, Arun Kumar Jagota, Gokulakrishnan Gopalakrishnan, Manpreet Singh, K C Krithika, Sukumar Maddineni, Daeki Cho, Bo Zong, Yingbo Zhou, Caiming Xiong, Silvio Savarese, Steven Hoi, Huan Wang ; (226):1−6, 2023. (Machine Learning Open Source Software Paper) [ abs ][ pdf ][ bib ]      [ code ]

Autoregressive Networks Binyan Jiang, Jialiang Li, Qiwei Yao ; (227):1−69, 2023. [ abs ][ pdf ][ bib ]

On the Optimality of Nuclear-norm-based Matrix Completion for Problems with Smooth Non-linear Structure Yunhua Xiang, Tianyu Zhang, Xu Wang, Ali Shojaie, Noah Simon ; (228):1−38, 2023. [ abs ][ pdf ][ bib ]

Interpretable and Fair Boolean Rule Sets via Column Generation Connor Lawless, Sanjeeb Dash, Oktay Gunluk, Dennis Wei ; (229):1−50, 2023. [ abs ][ pdf ][ bib ]

Sample Complexity for Distributionally Robust Learning under chi-square divergence Zhengyu Zhou, Weiwei Liu ; (230):1−27, 2023. [ abs ][ pdf ][ bib ]

Statistical Comparisons of Classifiers by Generalized Stochastic Dominance Christoph Jansen, Malte Nalenz, Georg Schollmeyer, Thomas Augustin ; (231):1−37, 2023. [ abs ][ pdf ][ bib ]

Lifted Bregman Training of Neural Networks Xiaoyu Wang, Martin Benning ; (232):1−51, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Strategic Knowledge Transfer Max Olan Smith, Thomas Anthony, Michael P. Wellman ; (233):1−96, 2023. [ abs ][ pdf ][ bib ]

MultiZoo and MultiBench: A Standardized Toolkit for Multimodal Deep Learning Paul Pu Liang, Yiwei Lyu, Xiang Fan, Arav Agarwal, Yun Cheng, Louis-Philippe Morency, Ruslan Salakhutdinov ; (234):1−7, 2023. (Machine Learning Open Source Software Paper) [ abs ][ pdf ][ bib ]      [ code ]

Tractable and Near-Optimal Adversarial Algorithms for Robust Estimation in Contaminated Gaussian Models Ziyue Wang, Zhiqiang Tan ; (235):1−112, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Neural Q-learning for solving PDEs Samuel N. Cohen, Deqing Jiang, Justin Sirignano ; (236):1−49, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Scalable Computation of Causal Bounds Madhumitha Shridharan, Garud Iyengar ; (237):1−35, 2023. [ abs ][ pdf ][ bib ]

Efficient Computation of Rankings from Pairwise Comparisons M. E. J. Newman ; (238):1−25, 2023. [ abs ][ pdf ][ bib ]

Leaky Hockey Stick Loss: The First Negatively Divergent Margin-based Loss Function for Classification Oh-Ran Kwon, Hui Zou ; (239):1−40, 2023. [ abs ][ pdf ][ bib ]      [ code ]

PaLM: Scaling Language Modeling with Pathways Aakanksha Chowdhery, Sharan Narang, Jacob Devlin, Maarten Bosma, Gaurav Mishra, Adam Roberts, Paul Barham, Hyung Won Chung, Charles Sutton, Sebastian Gehrmann, Parker Schuh, Kensen Shi, Sasha Tsvyashchenko, Joshua Maynez, Abhishek Rao, Parker Barnes, Yi Tay, Noam Shazeer, Vinodkumar Prabhakaran, Emily Reif, Nan Du, Ben Hutchinson, Reiner Pope, James Bradbury, Jacob Austin, Michael Isard, Guy Gur-Ari, Pengcheng Yin, Toju Duke, Anselm Levskaya, Sanjay Ghemawat, Sunipa Dev, Henryk Michalewski, Xavier Garcia, Vedant Misra, Kevin Robinson, Liam Fedus, Denny Zhou, Daphne Ippolito, David Luan, Hyeontaek Lim, Barret Zoph, Alexander Spiridonov, Ryan Sepassi, David Dohan, Shivani Agrawal, Mark Omernick, Andrew M. Dai, Thanumalayan Sankaranarayana Pillai, Marie Pellat, Aitor Lewkowycz, Erica Moreira, Rewon Child, Oleksandr Polozov, Katherine Lee, Zongwei Zhou, Xuezhi Wang, Brennan Saeta, Mark Diaz, Orhan Firat, Michele Catasta, Jason Wei, Kathy Meier-Hellstern, Douglas Eck, Jeff Dean, Slav Petrov, Noah Fiedel ; (240):1−113, 2023. [ abs ][ pdf ][ bib ]

Improved Powered Stochastic Optimization Algorithms for Large-Scale Machine Learning Zhuang Yang ; (241):1−29, 2023. [ abs ][ pdf ][ bib ]

Sparse Graph Learning from Spatiotemporal Time Series Andrea Cini, Daniele Zambon, Cesare Alippi ; (242):1−36, 2023. [ abs ][ pdf ][ bib ]

Alpha-divergence Variational Inference Meets Importance Weighted Auto-Encoders: Methodology and Asymptotics Kamélia Daudel, Joe Benton, Yuyang Shi, Arnaud Doucet ; (243):1−83, 2023. [ abs ][ pdf ][ bib ]

Selection by Prediction with Conformal p-values Ying Jin, Emmanuel J. Candes ; (244):1−41, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Confidence Intervals and Hypothesis Testing for High-dimensional Quantile Regression: Convolution Smoothing and Debiasing Yibo Yan, Xiaozhou Wang, Riquan Zhang ; (245):1−49, 2023. [ abs ][ pdf ][ bib ]

Graph Attention Retrospective Kimon Fountoulakis, Amit Levi, Shenghao Yang, Aseem Baranwal, Aukosh Jagannath ; (246):1−52, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Importance Sparsification for Sinkhorn Algorithm Mengyu Li, Jun Yu, Tao Li, Cheng Meng ; (247):1−44, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Improving multiple-try Metropolis with local balancing Philippe Gagnon, Florian Maire, Giacomo Zanella ; (248):1−59, 2023. [ abs ][ pdf ][ bib ]

Unbiased Multilevel Monte Carlo Methods for Intractable Distributions: MLMC Meets MCMC Tianze Wang, Guanyang Wang ; (249):1−40, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Convex Reinforcement Learning in Finite Trials Mirco Mutti, Riccardo De Santi, Piersilvio De Bartolomeis, Marcello Restelli ; (250):1−42, 2023. [ abs ][ pdf ][ bib ]

Atlas: Few-shot Learning with Retrieval Augmented Language Models Gautier Izacard, Patrick Lewis, Maria Lomeli, Lucas Hosseini, Fabio Petroni, Timo Schick, Jane Dwivedi-Yu, Armand Joulin, Sebastian Riedel, Edouard Grave ; (251):1−43, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Adaptive False Discovery Rate Control with Privacy Guarantee Xintao Xia, Zhanrui Cai ; (252):1−35, 2023. [ abs ][ pdf ][ bib ]

Estimating the Carbon Footprint of BLOOM, a 176B Parameter Language Model Alexandra Sasha Luccioni, Sylvain Viguier, Anne-Laure Ligozat ; (253):1−15, 2023. [ abs ][ pdf ][ bib ]      [ code ]

skrl: Modular and Flexible Library for Reinforcement Learning Antonio Serrano-Muñoz, Dimitrios Chrysostomou, Simon Bøgh, Nestor Arana-Arexolaleiba ; (254):1−9, 2023. (Machine Learning Open Source Software Paper) [ abs ][ pdf ][ bib ]      [ code ]

Torchhd: An Open Source Python Library to Support Research on Hyperdimensional Computing and Vector Symbolic Architectures Mike Heddes, Igor Nunes, Pere Vergés, Denis Kleyko, Danny Abraham, Tony Givargis, Alexandru Nicolau, Alexander Veidenbaum ; (255):1−10, 2023. (Machine Learning Open Source Software Paper) [ abs ][ pdf ][ bib ]      [ code ]

Scalable Real-Time Recurrent Learning Using Columnar-Constructive Networks Khurram Javed, Haseeb Shah, Richard S. Sutton, Martha White ; (256):1−34, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Fairlearn: Assessing and Improving Fairness of AI Systems Hilde Weerts, Miroslav Dudík, Richard Edgar, Adrin Jalali, Roman Lutz, Michael Madaio ; (257):1−8, 2023. (Machine Learning Open Source Software Paper) [ abs ][ pdf ][ bib ]      [ code ]

Multi-view Collaborative Gaussian Process Dynamical Systems Shiliang Sun, Jingjing Fei, Jing Zhao, Liang Mao ; (258):1−32, 2023. [ abs ][ pdf ][ bib ]

Scalable high-dimensional Bayesian varying coefficient models with unknown within-subject covariance Ray Bai, Mary R. Boland, Yong Chen ; (259):1−49, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Learning to Rank under Multinomial Logit Choice James A. Grant, David S. Leslie ; (260):1−49, 2023. [ abs ][ pdf ][ bib ]

Nearest Neighbor Dirichlet Mixtures Shounak Chattopadhyay, Antik Chakraborty, David B. Dunson ; (261):1−46, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Minimax Estimation for Personalized Federated Learning: An Alternative between FedAvg and Local Training? Shuxiao Chen, Qinqing Zheng, Qi Long, Weijie J. Su ; (262):1−59, 2023. [ abs ][ pdf ][ bib ]

Distributed Algorithms for U-statistics-based Empirical Risk Minimization Lanjue Chen, Alan T.K. Wan, Shuyi Zhang, Yong Zhou ; (263):1−43, 2023. [ abs ][ pdf ][ bib ]

ProtoryNet - Interpretable Text Classification Via Prototype Trajectories Dat Hong, Tong Wang, Stephen Baek ; (264):1−39, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Surrogate Assisted Semi-supervised Inference for High Dimensional Risk Prediction Jue Hou, Zijian Guo, Tianxi Cai ; (265):1−58, 2023. [ abs ][ pdf ][ bib ]

On the Estimation of Derivatives Using Plug-in Kernel Ridge Regression Estimators Zejian Liu, Meng Li ; (266):1−37, 2023. [ abs ][ pdf ][ bib ]

Sparse Plus Low Rank Matrix Decomposition: A Discrete Optimization Approach Dimitris Bertsimas, Ryan Cory-Wright, Nicholas A. G. Johnson ; (267):1−51, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Revisiting minimum description length complexity in overparameterized models Raaz Dwivedi, Chandan Singh, Bin Yu, Martin Wainwright ; (268):1−59, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Dynamic Ranking with the BTL Model: A Nearest Neighbor based Rank Centrality Method Eglantine Karlé, Hemant Tyagi ; (269):1−57, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Sharper Analysis for Minibatch Stochastic Proximal Point Methods: Stability, Smoothness, and Deviation Xiao-Tong Yuan, Ping Li ; (270):1−52, 2023. [ abs ][ pdf ][ bib ]

Causal Discovery with Unobserved Confounding and Non-Gaussian Data Y. Samuel Wang, Mathias Drton ; (271):1−61, 2023. [ abs ][ pdf ][ bib ]

Distributed Sparse Regression via Penalization Yao Ji, Gesualdo Scutari, Ying Sun, Harsha Honnappa ; (272):1−62, 2023. [ abs ][ pdf ][ bib ]

Online Non-stochastic Control with Partial Feedback Yu-Hu Yan, Peng Zhao, Zhi-Hua Zhou ; (273):1−50, 2023. [ abs ][ pdf ][ bib ]

A Continuous-time Stochastic Gradient Descent Method for Continuous Data Kexin Jin, Jonas Latz, Chenguang Liu, Carola-Bibiane Schönlieb ; (274):1−48, 2023. [ abs ][ pdf ][ bib ]

Adaptive Clustering Using Kernel Density Estimators Ingo Steinwart, Bharath K. Sriperumbudur, Philipp Thomann ; (275):1−56, 2023. [ abs ][ pdf ][ bib ]

On Biased Compression for Distributed Learning Aleksandr Beznosikov, Samuel Horváth, Peter Richtárik, Mher Safaryan ; (276):1−50, 2023. [ abs ][ pdf ][ bib ]

Elastic Gradient Descent, an Iterative Optimization Method Approximating the Solution Paths of the Elastic Net Oskar Allerbo, Johan Jonasson, Rebecka Jörnsten ; (277):1−53, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Distinguishing Cause and Effect in Bivariate Structural Causal Models: A Systematic Investigation Christoph Käding, Jakob Runge ; (278):1−144, 2023. [ abs ][ pdf ][ bib ]

Sparse Markov Models for High-dimensional Inference Guilherme Ost, Daniel Y. Takahashi ; (279):1−54, 2023. [ abs ][ pdf ][ bib ]

Removing Data Heterogeneity Influence Enhances Network Topology Dependence of Decentralized SGD Kun Yuan, Sulaiman A. Alghunaim, Xinmeng Huang ; (280):1−53, 2023. [ abs ][ pdf ][ bib ]

The Bayesian Learning Rule Mohammad Emtiyaz Khan, Håvard Rue ; (281):1−46, 2023. [ abs ][ pdf ][ bib ]

Community models for networks observed through edge nominations Tianxi Li, Elizaveta Levina, Ji Zhu ; (282):1−36, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Near-Optimal Weighted Matrix Completion Oscar López ; (283):1−40, 2023. [ abs ][ pdf ][ bib ]

A Complete Characterization of Linear Estimators for Offline Policy Evaluation Juan C. Perdomo, Akshay Krishnamurthy, Peter Bartlett, Sham Kakade ; (284):1−50, 2023. [ abs ][ pdf ][ bib ]

Generic Unsupervised Optimization for a Latent Variable Model With Exponential Family Observables Hamid Mousavi, Jakob Drefs, Florian Hirschberger, Jörg Lücke ; (285):1−59, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Low Tree-Rank Bayesian Vector Autoregression Models Leo L Duan, Zeyu Yuwen, George Michailidis, Zhengwu Zhang ; (286):1−35, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Universal Approximation Property of Invertible Neural Networks Isao Ishikawa, Takeshi Teshima, Koichi Tojo, Kenta Oono, Masahiro Ikeda, Masashi Sugiyama ; (287):1−68, 2023. [ abs ][ pdf ][ bib ]

A New Look at Dynamic Regret for Non-Stationary Stochastic Bandits Yasin Abbasi-Yadkori, András György, Nevena Lazić ; (288):1−37, 2023. [ abs ][ pdf ][ bib ]

Deep Neural Networks with Dependent Weights: Gaussian Process Mixture Limit, Heavy Tails, Sparsity and Compressibility Hoil Lee, Fadhel Ayed, Paul Jung, Juho Lee, Hongseok Yang, Francois Caron ; (289):1−78, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Deletion and Insertion Tests in Regression Models Naofumi Hama, Masayoshi Mase, Art B. Owen ; (290):1−38, 2023. [ abs ][ pdf ][ bib ]

A Unified Analysis of Multi-task Functional Linear Regression Models with Manifold Constraint and Composite Quadratic Penalty Shiyuan He, Hanxuan Ye, Kejun He ; (291):1−69, 2023. [ abs ][ pdf ][ bib ]

From Understanding Genetic Drift to a Smart-Restart Mechanism for Estimation-of-Distribution Algorithms Weijie Zheng, Benjamin Doerr ; (292):1−40, 2023. [ abs ][ pdf ][ bib ]

Augmented Transfer Regression Learning with Semi-non-parametric Nuisance Models Molei Liu, Yi Zhang, Katherine P. Liao, Tianxi Cai ; (293):1−50, 2023. [ abs ][ pdf ][ bib ]

Erratum: Risk Bounds for the Majority Vote: From a PAC-Bayesian Analysis to a Learning Algorithm Louis-Philippe Vignault, Audrey Durand, Pascal Germain ; (294):1−13, 2023. [ abs ][ pdf ][ bib ]

Weibull Racing Survival Analysis with Competing Events, Left Truncation, and Time-Varying Covariates Quan Zhang, Yanxun Xu, Mei-Cheng Wang, Mingyuan Zhou ; (295):1−43, 2023. [ abs ][ pdf ][ bib ]

High-Dimensional Inference for Generalized Linear Models with Hidden Confounding Jing Ouyang, Kean Ming Tan, Gongjun Xu ; (296):1−61, 2023. [ abs ][ pdf ][ bib ]

Causal Bandits for Linear Structural Equation Models Burak Varici, Karthikeyan Shanmugam, Prasanna Sattigeri, Ali Tajer ; (297):1−59, 2023. [ abs ][ pdf ][ bib ]

A General Learning Framework for Open Ad Hoc Teamwork Using Graph-based Policy Learning Arrasy Rahman, Ignacio Carlucho, Niklas Höpner, Stefano V. Albrecht ; (298):1−74, 2023. [ abs ][ pdf ][ bib ]      [ code ]

A PDE approach for regret bounds under partial monitoring Erhan Bayraktar, Ibrahim Ekren, Xin Zhang ; (299):1−24, 2023. [ abs ][ pdf ][ bib ]

Sensitivity-Free Gradient Descent Algorithms Ion Matei, Maksym Zhenirovskyy, Johan de Kleer, John Maxwell ; (300):1−26, 2023. [ abs ][ pdf ][ bib ]

Learning Optimal Feedback Operators and their Sparse Polynomial Approximations Karl Kunisch, Donato Vásquez-Varas, Daniel Walter ; (301):1−38, 2023. [ abs ][ pdf ][ bib ]

Pivotal Estimation of Linear Discriminant Analysis in High Dimensions Ethan X. Fang, Yajun Mei, Yuyang Shi, Qunzhi Xu, Tuo Zhao ; (302):1−45, 2023. [ abs ][ pdf ][ bib ]

Random Feature Amplification: Feature Learning and Generalization in Neural Networks Spencer Frei, Niladri S. Chatterji, Peter L. Bartlett ; (303):1−49, 2023. [ abs ][ pdf ][ bib ]

Two Sample Testing in High Dimension via Maximum Mean Discrepancy Hanjia Gao, Xiaofeng Shao ; (304):1−33, 2023. [ abs ][ pdf ][ bib ]

Continuous-in-time Limit for Bayesian Bandits Yuhua Zhu, Zachary Izzo, Lexing Ying ; (305):1−35, 2023. [ abs ][ pdf ][ bib ]

Multi-Consensus Decentralized Accelerated Gradient Descent Haishan Ye, Luo Luo, Ziang Zhou, Tong Zhang ; (306):1−50, 2023. [ abs ][ pdf ][ bib ]

Fast Screening Rules for Optimal Design via Quadratic Lasso Reformulation Guillaume Sagnol, Luc Pronzato ; (307):1−32, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Nevis'22: A Stream of 100 Tasks Sampled from 30 Years of Computer Vision Research Jorg Bornschein, Alexandre Galashov, Ross Hemsley, Amal Rannen-Triki, Yutian Chen, Arslan Chaudhry, Xu Owen He, Arthur Douillard, Massimo Caccia, Qixuan Feng, Jiajun Shen, Sylvestre-Alvise Rebuffi, Kitty Stacpoole, Diego de las Casas, Will Hawkins, Angeliki Lazaridou, Yee Whye Teh, Andrei A. Rusu, Razvan Pascanu, Marc’Aurelio Ranzato ; (308):1−77, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Dimension Reduction and MARS Yu Liu, Degui Li, Yingcun Xia ; (309):1−30, 2023. [ abs ][ pdf ][ bib ]

Prediction Equilibrium for Dynamic Network Flows Lukas Graf, Tobias Harks, Kostas Kollias, Michael Markl ; (310):1−33, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Microcanonical Hamiltonian Monte Carlo Jakob Robnik, G. Bruno De Luca, Eva Silverstein, Uroš Seljak ; (311):1−34, 2023. [ abs ][ pdf ][ bib ]      [ code ]

The Measure and Mismeasure of Fairness Sam Corbett-Davies, Johann D. Gaebler, Hamed Nilforoshan, Ravi Shroff, Sharad Goel ; (312):1−117, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Zeroth-Order Alternating Gradient Descent Ascent Algorithms for A Class of Nonconvex-Nonconcave Minimax Problems Zi Xu, Zi-Qi Wang, Jun-Lin Wang, Yu-Hong Dai ; (313):1−25, 2023. [ abs ][ pdf ][ bib ]

Fast Expectation Propagation for Heteroscedastic, Lasso-Penalized, and Quantile Regression Jackson Zhou, John T. Ormerod, Clara Grazian ; (314):1−39, 2023. [ abs ][ pdf ][ bib ]      [ code ]

MARLlib: A Scalable and Efficient Multi-agent Reinforcement Learning Library Siyi Hu, Yifan Zhong, Minquan Gao, Weixun Wang, Hao Dong, Xiaodan Liang, Zhihui Li, Xiaojun Chang, Yaodong Yang ; (315):1−23, 2023. (Machine Learning Open Source Software Paper) [ abs ][ pdf ][ bib ]      [ code ]

The Dynamics of Sharpness-Aware Minimization: Bouncing Across Ravines and Drifting Towards Wide Minima Peter L. Bartlett, Philip M. Long, Olivier Bousquet ; (316):1−36, 2023. [ abs ][ pdf ][ bib ]

Mixed Regression via Approximate Message Passing Nelvin Tan, Ramji Venkataramanan ; (317):1−44, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Operator learning with PCA-Net: upper and lower complexity bounds Samuel Lanthaler ; (318):1−67, 2023. [ abs ][ pdf ][ bib ]

Bagging in overparameterized learning: Risk characterization and risk monotonization Pratik Patil, Jin-Hong Du, Arun Kumar Kuchibhotla ; (319):1−113, 2023. [ abs ][ pdf ][ bib ]

Higher-Order Spectral Clustering Under Superimposed Stochastic Block Models Subhadeep Paul, Olgica Milenkovic, Yuguo Chen ; (320):1−58, 2023. [ abs ][ pdf ][ bib ]

Scale Invariant Power Iteration Cheolmin Kim, Youngseok Kim, Diego Klabjan ; (321):1−47, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Consistent Second-Order Conic Integer Programming for Learning Bayesian Networks Simge Kucukyavuz, Ali Shojaie, Hasan Manzour, Linchuan Wei, Hao-Hsiang Wu ; (322):1−38, 2023. [ abs ][ pdf ][ bib ]

Semi-Supervised Off-Policy Reinforcement Learning and Value Estimation for Dynamic Treatment Regimes Aaron Sonabend-W, Nilanjana Laha, Ashwin N. Ananthakrishnan, Tianxi Cai, Rajarshi Mukherjee ; (323):1−86, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Be More Active! Understanding the Differences Between Mean and Sampled Representations of Variational Autoencoders Lisa Bonheme, Marek Grzes ; (324):1−30, 2023. [ abs ][ pdf ][ bib ]      [ code ]

ProtoShotXAI: Using Prototypical Few-Shot Architecture for Explainable AI Samuel Hess, Gregory Ditzler ; (325):1−49, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Benign Overfitting of Constant-Stepsize SGD for Linear Regression Difan Zou, Jingfeng Wu, Vladimir Braverman, Quanquan Gu, Sham M. Kakade ; (326):1−58, 2023. [ abs ][ pdf ][ bib ]

Reproducing Kernels and New Approaches in Compositional Data Analysis Binglin Li, Changwon Yoon, Jeongyoun Ahn ; (327):1−34, 2023. [ abs ][ pdf ][ bib ]

Bandit problems with fidelity rewards Gábor Lugosi, Ciara Pike-Burke, Pierre-André Savalle ; (328):1−44, 2023. [ abs ][ pdf ][ bib ]

Mini-batching error and adaptive Langevin dynamics Inass Sekkat, Gabriel Stoltz ; (329):1−58, 2023. [ abs ][ pdf ][ bib ]

The Power of Contrast for Feature Learning: A Theoretical Analysis Wenlong Ji, Zhun Deng, Ryumei Nakada, James Zou, Linjun Zhang ; (330):1−78, 2023. [ abs ][ pdf ][ bib ]

Fair Data Representation for Machine Learning at the Pareto Frontier Shizhou Xu, Thomas Strohmer ; (331):1−63, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Learning Conditional Generative Models for Phase Retrieval Tobias Uelwer, Sebastian Konietzny, Alexander Oberstrass, Stefan Harmeling ; (332):1−28, 2023. [ abs ][ pdf ][ bib ]

Weisfeiler and Leman go Machine Learning: The Story so far Christopher Morris, Yaron Lipman, Haggai Maron, Bastian Rieck, Nils M. Kriege, Martin Grohe, Matthias Fey, Karsten Borgwardt ; (333):1−59, 2023. [ abs ][ pdf ][ bib ]

Dimensionality Reduction and Wasserstein Stability for Kernel Regression Stephan Eckstein, Armin Iske, Mathias Trabs ; (334):1−35, 2023. [ abs ][ pdf ][ bib ]

T-Cal: An Optimal Test for the Calibration of Predictive Models Donghwan Lee, Xinmeng Huang, Hamed Hassani, Edgar Dobriban ; (335):1−72, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Finite-time Koopman Identifier: A Unified Batch-online Learning Framework for Joint Learning of Koopman Structure and Parameters Majid Mazouchi, Subramanya Nageshrao, Hamidreza Modares ; (336):1−35, 2023. [ abs ][ pdf ][ bib ]

The Art of BART: Minimax Optimality over Nonhomogeneous Smoothness in High Dimension Seonghyun Jeong, Veronika Rockova ; (337):1−65, 2023. [ abs ][ pdf ][ bib ]

Community Recovery in the Geometric Block Model Sainyam Galhotra, Arya Mazumdar, Soumyabrata Pal, Barna Saha ; (338):1−53, 2023. [ abs ][ pdf ][ bib ]

Compression, Generalization and Learning Marco C. Campi, Simone Garatti ; (339):1−74, 2023. [ abs ][ pdf ][ bib ]

Topological Hidden Markov Models Adam B Kashlak, Prachi Loliencar, Giseon Heo ; (340):1−49, 2023. [ abs ][ pdf ][ bib ]      [ code ]

A Bayesian Bradley-Terry model to compare multiple ML algorithms on multiple data sets Jacques Wainer ; (341):1−34, 2023. [ abs ][ pdf ][ bib ]      [ code ]

The Geometry and Calculus of Losses Robert C. Williamson, Zac Cranko ; (342):1−72, 2023. [ abs ][ pdf ][ bib ]

Accelerated Primal-Dual Mirror Dynamics for Centralized and Distributed Constrained Convex Optimization Problems You Zhao, Xiaofeng Liao, Xing He, Mingliang Zhou, Chaojie Li ; (343):1−59, 2023. [ abs ][ pdf ][ bib ]

Large data limit of the MBO scheme for data clustering: convergence of the dynamics Tim Laux, Jona Lelmi ; (344):1−49, 2023. [ abs ][ pdf ][ bib ]

Radial Basis Approximation of Tensor Fields on Manifolds: From Operator Estimation to Manifold Learning John Harlim, Shixiao Willing Jiang, John Wilson Peoples ; (345):1−85, 2023. [ abs ][ pdf ][ bib ]

Linear Partial Monitoring for Sequential Decision Making: Algorithms, Regret Bounds and Applications Johannes Kirschner, Tor Lattimore, Andreas Krause ; (346):1−45, 2023. [ abs ][ pdf ][ bib ]

Implicit Regularization and Entrywise Convergence of Riemannian Optimization for Low Tucker-Rank Tensor Completion Haifeng Wang, Jinchi Chen, Ke Wei ; (347):1−84, 2023. [ abs ][ pdf ][ bib ]

Conformal Frequency Estimation using Discrete Sketched Data with Coverage for Distinct Queries Matteo Sesia, Stefano Favaro, Edgar Dobriban ; (348):1−80, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Instance-Dependent Generalization Bounds via Optimal Transport Songyan Hou, Parnian Kassraie, Anastasis Kratsios, Andreas Krause, Jonas Rothfuss ; (349):1−51, 2023. [ abs ][ pdf ][ bib ]

Robust High-Dimensional Low-Rank Matrix Estimation: Optimal Rate and Data-Adaptive Tuning Xiaolong Cui, Lei Shi, Wei Zhong, Changliang Zou ; (350):1−57, 2023. [ abs ][ pdf ][ bib ]

Modular Regression: Improving Linear Models by Incorporating Auxiliary Data Ying Jin, Dominik Rothenhäusler ; (351):1−52, 2023. [ abs ][ pdf ][ bib ]

Group SLOPE Penalized Low-Rank Tensor Regression Yang Chen, Ziyan Luo ; (352):1−30, 2023. [ abs ][ pdf ][ bib ]

Limitations on approximation by deep and shallow neural networks Guergana Petrova, Przemyslaw Wojtaszczyk ; (353):1−38, 2023. [ abs ][ pdf ][ bib ]

A Unified Experiment Design Approach for Cyclic and Acyclic Causal Models Ehsan Mokhtarian, Saber Salehkaleybar, AmirEmad Ghassami, Negar Kiyavash ; (354):1−31, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Beyond Spectral Gap: The Role of the Topology in Decentralized Learning Thijs Vogels, Hadrien Hendrikx, Martin Jaggi ; (355):1−31, 2023. [ abs ][ pdf ][ bib ]      [ code ]

MAUVE Scores for Generative Models: Theory and Practice Krishna Pillutla, Lang Liu, John Thickstun, Sean Welleck, Swabha Swayamdipta, Rowan Zellers, Sewoong Oh, Yejin Choi, Zaid Harchaoui ; (356):1−92, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Optimal Approximation Rates for Deep ReLU Neural Networks on Sobolev and Besov Spaces Jonathan W. Siegel ; (357):1−52, 2023. [ abs ][ pdf ][ bib ]

Optimal Parameter-Transfer Learning by Semiparametric Model Averaging Xiaonan Hu, Xinyu Zhang ; (358):1−53, 2023. [ abs ][ pdf ][ bib ]

A Unified Theory of Diversity in Ensemble Learning Danny Wood, Tingting Mu, Andrew M. Webb, Henry W. J. Reeve, Mikel Luján, Gavin Brown ; (359):1−49, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Attribution-based Explanations that Provide Recourse Cannot be Robust Hidde Fokkema, Rianne de Heide, Tim van Erven ; (360):1−37, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Differentially Private Hypothesis Testing for Linear Regression Daniel G. Alabi, Salil P. Vadhan ; (361):1−50, 2023. [ abs ][ pdf ][ bib ]

Discovering Salient Neurons in deep NLP models Nadir Durrani, Fahim Dalvi, Hassan Sajjad ; (362):1−40, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Avalanche: A PyTorch Library for Deep Continual Learning Antonio Carta, Lorenzo Pellegrini, Andrea Cossu, Hamed Hemati, Vincenzo Lomonaco ; (363):1−6, 2023. (Machine Learning Open Source Software Paper) [ abs ][ pdf ][ bib ]      [ code ]

Partial Order in Chaos: Consensus on Feature Attributions in the Rashomon Set Gabriel Laberge, Yann Pequignot, Alexandre Mathieu, Foutse Khomh, Mario Marchand ; (364):1−50, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Hard-Constrained Deep Learning for Climate Downscaling Paula Harder, Alex Hernandez-Garcia, Venkatesh Ramesh, Qidong Yang, Prasanna Sattegeri, Daniela Szwarcman, Campbell Watson, David Rolnick ; (365):1−40, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Confidence and Uncertainty Assessment for Distributional Random Forests Jeffrey Näf, Corinne Emmenegger, Peter Bühlmann, Nicolai Meinshausen ; (366):1−77, 2023. [ abs ][ pdf ][ bib ]      [ code ]

TorchOpt: An Efficient Library for Differentiable Optimization Jie Ren*, Xidong Feng*, Bo Liu*, Xuehai Pan*, Yao Fu, Luo Mai, Yaodong Yang ; (367):1−14, 2023. (Machine Learning Open Source Software Paper) [ abs ][ pdf ][ bib ]      [ code ]

LapGym - An Open Source Framework for Reinforcement Learning in Robot-Assisted Laparoscopic Surgery Paul Maria Scheikl, Balázs Gyenes, Rayan Younis, Christoph Haas, Gerhard Neumann, Martin Wagner, Franziska Mathis-Ullrich ; (368):1−42, 2023. [ abs ][ pdf ][ bib ]      [ code ]

A Permutation-Free Kernel Independence Test Shubhanshu Shekhar, Ilmun Kim, Aaditya Ramdas ; (369):1−68, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Densely Connected G-invariant Deep Neural Networks with Signed Permutation Representations Devanshu Agrawal, James Ostrowski ; (370):1−40, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Decentralized Robust V-learning for Solving Markov Games with Model Uncertainty Shaocong Ma, Ziyi Chen, Shaofeng Zou, Yi Zhou ; (371):1−40, 2023. [ abs ][ pdf ][ bib ]

A Unified Recipe for Deriving (Time-Uniform) PAC-Bayes Bounds Ben Chugg, Hongjian Wang, Aaditya Ramdas ; (372):1−61, 2023. [ abs ][ pdf ][ bib ]

Multilevel CNNs for Parametric PDEs Cosmas Heiß, Ingo Gühring, Martin Eigel ; (373):1−42, 2023. [ abs ][ pdf ][ bib ]

Diffusion Bridge Mixture Transports, Schrödinger Bridge Problems and Generative Modeling Stefano Peluchetti ; (374):1−51, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Set-valued Classification with Out-of-distribution Detection for Many Classes Zhou Wang, Xingye Qiao ; (375):1−39, 2023. [ abs ][ pdf ][ bib ]      [ code ]

On the Dynamics Under the Unhinged Loss and Beyond Xiong Zhou, Xianming Liu, Hanzhang Wang, Deming Zhai, Junjun Jiang, Xiangyang Ji ; (376):1−62, 2023. [ abs ][ pdf ][ bib ]

Scaling Up Models and Data with t5x and seqio Adam Roberts, Hyung Won Chung, Gaurav Mishra, Anselm Levskaya, James Bradbury, Daniel Andor, Sharan Narang, Brian Lester, Colin Gaffney, Afroz Mohiuddin, Curtis Hawthorne, Aitor Lewkowycz, Alex Salcianu, Marc van Zee, Jacob Austin, Sebastian Goodman, Livio Baldini Soares, Haitang Hu, Sasha Tsvyashchenko, Aakanksha Chowdhery, Jasmijn Bastings, Jannis Bulian, Xavier Garcia, Jianmo Ni, Andrew Chen, Kathleen Kenealy, Kehang Han, Michelle Casbon, Jonathan H. Clark, Stephan Lee, Dan Garrette, James Lee-Thorp, Colin Raffel, Noam Shazeer, Marvin Ritter, Maarten Bosma, Alexandre Passos, Jeremy Maitin-Shepard, Noah Fiedel, Mark Omernick, Brennan Saeta, Ryan Sepassi, Alexander Spiridonov, Joshua Newlan, Andrea Gesmundo ; (377):1−8, 2023. (Machine Learning Open Source Software Paper) [ abs ][ pdf ][ bib ]      [ code ]

Principled Out-of-Distribution Detection via Multiple Testing Akshayaa Magesh, Venugopal V. Veeravalli, Anirban Roy, Susmit Jha ; (378):1−35, 2023. [ abs ][ pdf ][ bib ]

On Learning Rates and Schrödinger Operators Bin Shi, Weijie Su, Michael I. Jordan ; (379):1−53, 2023. [ abs ][ pdf ][ bib ]

Randomized Spectral Co-Clustering for Large-Scale Directed Networks Xiao Guo, Yixuan Qiu, Hai Zhang, Xiangyu Chang ; (380):1−68, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Low-rank Tensor Estimation via Riemannian Gauss-Newton: Statistical Optimality and Second-Order Convergence Yuetian Luo, Anru R. Zhang ; (381):1−48, 2023. [ abs ][ pdf ][ bib ]      [ code ]

A Novel Integer Linear Programming Approach for Global L0 Minimization Diego Delle Donne, Matthieu Kowalski, Leo Liberti ; (382):1−28, 2023. [ abs ][ pdf ][ bib ]

Over-parameterized Deep Nonparametric Regression for Dependent Data with Its Applications to Reinforcement Learning Xingdong Feng, Yuling Jiao, Lican Kang, Baqun Zhang, Fan Zhou ; (383):1−40, 2023. [ abs ][ pdf ][ bib ]

On Unbalanced Optimal Transport: Gradient Methods, Sparsity and Approximation Error Quang Minh Nguyen, Hoang H. Nguyen, Yi Zhou, Lam M. Nguyen ; (384):1−41, 2023. [ abs ][ pdf ][ bib ]

Double Duality: Variational Primal-Dual Policy Optimization for Constrained Reinforcement Learning Zihao Li, Boyi Liu, Zhuoran Yang, Zhaoran Wang, Mengdi Wang ; (385):1−43, 2023. [ abs ][ pdf ][ bib ]

Scalable PAC-Bayesian Meta-Learning via the PAC-Optimal Hyper-Posterior: From Theory to Practice Jonas Rothfuss, Martin Josifoski, Vincent Fortuin, Andreas Krause ; (386):1−62, 2023. [ abs ][ pdf ][ bib ]

Distributed Statistical Inference under Heterogeneity Jia Gu, Song Xi Chen ; (387):1−57, 2023. [ abs ][ pdf ][ bib ]

Fourier Neural Operator with Learned Deformations for PDEs on General Geometries Zongyi Li, Daniel Zhengyu Huang, Burigede Liu, Anima Anandkumar ; (388):1−26, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Semiparametric Inference Using Fractional Posteriors Alice L'Huillier, Luke Travis, Ismaël Castillo, Kolyan Ray ; (389):1−61, 2023. [ abs ][ pdf ][ bib ]

A Scalable and Efficient Iterative Method for Copying Machine Learning Classifiers Nahuel Statuto, Irene Unceta, Jordi Nin, Oriol Pujol ; (390):1−34, 2023. [ abs ][ pdf ][ bib ]

Hierarchical Kernels in Deep Kernel Learning Wentao Huang, Houbao Lu, Haizhang Zhang ; (391):1−30, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Instance-Dependent Confidence and Early Stopping for Reinforcement Learning Eric Xia, Koulik Khamaru, Martin J. Wainwright, Michael I. Jordan ; (392):1−43, 2023. [ abs ][ pdf ][ bib ]

A Unified Approach to Controlling Implicit Regularization via Mirror Descent Haoyuan Sun, Khashayar Gatmiry, Kwangjun Ahn, Navid Azizan ; (393):1−58, 2023. [ abs ][ pdf ][ bib ]

Revisiting inference after prediction Keshav Motwani, Daniela Witten ; (394):1−18, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Adaptive Learning of Density Ratios in RKHS Werner Zellinger, Stefan Kindermann, Sergei V. Pereverzyev ; (395):1−28, 2023. [ abs ][ pdf ][ bib ]

RVCL: Evaluating the Robustness of Contrastive Learning via Verification Zekai Wang, Weiwei Liu ; (396):1−43, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Bayesian Spanning Tree: Estimating the Backbone of the Dependence Graph Leo L. Duan, David B. Dunson ; (397):1−44, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Finding Groups of Cross-Correlated Features in Bi-View Data Miheer Dewaskar, John Palowitch, Mark He, Michael I. Love, Andrew B. Nobel ; (398):1−47, 2023. [ abs ][ pdf ][ bib ]      [ code ]

Boosting Multi-agent Reinforcement Learning via Contextual Prompting Yue Deng, Zirui Wang, Xi Chen, Yin Zhang ; (399):1−34, 2023. [ abs ][ pdf ][ bib ]

Foundation Models and Fair Use Peter Henderson, Xuechen Li, Dan Jurafsky, Tatsunori Hashimoto, Mark A. Lemley, Percy Liang ; (400):1−79, 2023. [ abs ][ pdf ][ bib ]


Title: Machine Unlearning in Generative AI: A Survey

Abstract: Generative AI technologies have been deployed in many settings, including (multimodal) large language models and vision generative models. Their remarkable performance is attributable to massive training data and emergent reasoning abilities. However, these models can memorize and generate sensitive, biased, or dangerous information originating from the training data, especially data gathered by web crawling. New machine unlearning (MU) techniques are being developed to reduce or eliminate undesirable knowledge and its effects from the models, because techniques designed for traditional classification tasks cannot be directly applied to generative AI. This survey covers MU in generative AI comprehensively, including a new problem formulation, evaluation methods, and a structured discussion of the advantages and limitations of different kinds of MU techniques. It also presents several critical challenges and promising directions for MU research. A curated list of readings can be found at this https URL.
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)


The Impact of Machine Learning Algorithms on Academic Research Papers

Oki Bin Oki

Machine learning algorithms have become increasingly prevalent in the field of academic research, revolutionizing the way researchers approach and analyze data. These advanced computational techniques have the potential to uncover patterns, make predictions, and generate insights that were previously unattainable through traditional research methods. As the volume and complexity of data continue to grow, the integration of machine learning algorithms has become a crucial component in the pursuit of groundbreaking discoveries and advancements in various academic disciplines.

In this article, we will explore the impact of machine learning algorithms on academic research papers, delving into the advantages, common algorithms used, and the challenges faced by researchers in leveraging these powerful tools.


Advantages of Using Machine Learning Algorithms in Academic Research

The integration of machine learning algorithms in academic research has brought forth numerous benefits, including:

  • Enhanced Efficiency and Productivity: Machine learning algorithms can automate repetitive tasks, such as data preprocessing, feature extraction, and model training, allowing researchers to focus on higher-level analysis and interpretation. This increased efficiency can lead to faster turnaround times for research projects and more productive use of researchers’ time (a short pipeline sketch follows this list).
  • Improved Accuracy and Predictive Power: Machine learning algorithms are adept at identifying complex patterns and relationships within large, multidimensional datasets. By leveraging these algorithms, researchers can uncover insights and make predictions that would be challenging or impossible to achieve through traditional statistical methods.
  • Exploratory Data Analysis and Hypothesis Generation: Machine learning algorithms can be used for exploratory data analysis, helping researchers identify previously unknown trends, anomalies, and potential areas of investigation. This can lead to the formulation of new hypotheses and the exploration of novel research directions.
  • Scalability and Adaptability: As the volume and complexity of data continue to grow, machine learning algorithms can handle and process large datasets with ease, making them a valuable tool for researchers working with big data. Additionally, these algorithms can adapt to changing data patterns and evolve as new information becomes available.
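
To make the automation point above concrete, here is a minimal sketch, assuming scikit-learn and a synthetic dataset (both illustrative choices, not drawn from any particular study), of how a pipeline bundles preprocessing and model training into a single automated step:

```python
# Minimal sketch: a pipeline automating preprocessing + training in one object.
# The dataset and parameter choices are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

# Synthetic data standing in for a researcher's tabular dataset.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Chaining scaling (preprocessing) and a classifier (training) means that
# repeated experiments need only a single fit/score call.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])
pipe.fit(X_train, y_train)
print(f"Held-out accuracy: {pipe.score(X_test, y_test):.3f}")
```

Because the whole workflow lives in one object, re-running an experiment on new data becomes a single call rather than a series of manual steps.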

Common Machine Learning Algorithms Used in Academic Research

Researchers in various academic fields have embraced a wide range of machine-learning algorithms to enhance their research efforts. Some of the most commonly used algorithms include:

  • Linear Regression
  • Logistic Regression
  • Decision Trees
  • Support Vector Machines (SVMs)
  • Artificial Neural Networks
  • K-Means Clustering
  • Hierarchical Clustering
  • Principal Component Analysis (PCA)
  • Anomaly Detection
  • Text Classification
  • Sentiment Analysis
  • Named Entity Recognition
  • Topic Modeling
  • Image Classification
  • Object Detection
  • Semantic Segmentation
  • Image Captioning

The choice of algorithm depends on the specific research question, the nature of the data, and the desired outcomes. Researchers often experiment with multiple algorithms and compare their performance to identify the most suitable approach for their research objectives.
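
As a hypothetical illustration of that experimentation workflow, the sketch below compares a few of the algorithms listed above on a small benchmark dataset using 5-fold cross-validation; the dataset and model settings are assumptions chosen for brevity, not recommendations:

```python
# Compare several candidate algorithms on one dataset with cross-validation.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier

X, y = load_breast_cancer(return_X_y=True)

models = {
    "Logistic Regression": LogisticRegression(max_iter=5000),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Support Vector Machine": SVC(),
    "Neural Network": MLPClassifier(max_iter=2000, random_state=0),
}

# Mean cross-validated accuracy gives a like-for-like comparison across models.
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: {scores.mean():.3f} (+/- {scores.std():.3f})")
```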

How Machine Learning Algorithms Improve Efficiency and Accuracy in Academic Research

The integration of machine learning algorithms in academic research has significantly enhanced efficiency and accuracy in various ways:

  • Automated Data Processing: Machine learning algorithms can automate the tedious and time-consuming tasks of data cleaning, feature engineering, and data transformation, allowing researchers to focus on the more complex and analytical aspects of their research.
  • Predictive Modeling: By leveraging machine learning algorithms, researchers can develop predictive models that can accurately forecast outcomes, identify patterns, and make data-driven decisions. This can lead to more reliable and impactful research findings.
  • Pattern Recognition: Machine learning algorithms excel at identifying complex patterns and relationships within large, multidimensional datasets. This enables researchers to uncover insights that may have been overlooked using traditional analysis methods.
  • Anomaly Detection: Machine learning algorithms can detect anomalies, outliers, and unusual data points, which can be crucial in identifying potential errors, experimental flaws, or unexpected phenomena worthy of further investigation (see the sketch following this list).
  • Automated Evaluation and Optimization: Machine learning algorithms can be used to automate the evaluation and optimization of research models, helping researchers identify the most effective approaches and refine their methodologies.
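
As a small illustration of the anomaly-detection point, the following sketch flags unusual observations with an isolation forest; the synthetic data and the contamination rate are assumptions made for the example:

```python
# Flag outlying observations that may signal data errors or phenomena
# worth a closer look.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
# 300 "normal" observations plus a handful of extreme ones.
normal = rng.normal(loc=0.0, scale=1.0, size=(300, 2))
extremes = rng.uniform(low=6.0, high=8.0, size=(5, 2))
X = np.vstack([normal, extremes])

detector = IsolationForest(contamination=0.02, random_state=0).fit(X)
labels = detector.predict(X)  # -1 marks points the model deems anomalous
print(f"Flagged {np.sum(labels == -1)} of {len(X)} observations as anomalies")
```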

Challenges and Limitations of Using Machine Learning Algorithms in Academic Research Papers

While the benefits of using machine learning algorithms in academic research are substantial, there are also several challenges and limitations that researchers must consider:

  • Data Quality and Availability: The performance of machine learning algorithms is heavily dependent on the quality and quantity of the available data. Researchers must ensure that their datasets are comprehensive, representative, and free from biases or errors.
  • Interpretability and Explainability: Some machine learning algorithms, such as deep neural networks, can be inherently complex and difficult to interpret. Researchers must balance the predictive power of these algorithms with the need for transparency and the ability to explain their findings to the academic community.
  • Ethical Considerations: The use of machine learning algorithms in research raises ethical concerns, such as data privacy, algorithmic bias, and the potential misuse of findings. Researchers must carefully consider these ethical implications and adhere to established guidelines and best practices.
  • Computational Resources: Leveraging advanced machine learning algorithms can be computationally intensive, requiring significant computing power, memory, and storage. Researchers may face challenges in accessing and managing the necessary computational resources, especially in resource-constrained environments.
  • Interdisciplinary Collaboration: Effectively integrating machine learning algorithms into academic research often requires collaboration between domain experts and data science professionals. Bridging the gap between these disciplines can be a significant challenge, requiring effective communication and a shared understanding of the research objectives.

Researchers interested in incorporating machine learning algorithms into their academic work should consider exploring the resources and support available through their institution’s research computing or data science centers. These centers often provide training, consulting, and access to the necessary computational resources to help researchers navigate the complexities of machine learning and unlock its full potential.



Apple Intelligence Foundation Language Models


We present foundation language models developed to power Apple Intelligence features, including a ∼3 billion parameter model designed to run efficiently on devices and a large server-based language model designed for Private Cloud Compute. These models are designed to perform a wide range of tasks efficiently, accurately, and responsibly. This report describes the model architecture, the data used to train the model, the training process, how the models are optimized for inference, and the evaluation results. We highlight our focus on Responsible AI and how the principles are applied throughout the model development.

This paper provides technical details for Apple’s On-Device and Server Foundation Models, introduced on June 10, 2024, in this post .

Related readings and updates.

Introducing Apple’s On-Device and Server Foundation Models

At the 2024 Worldwide Developers Conference , we introduced Apple Intelligence, a personal intelligence system integrated deeply into iOS 18, iPadOS 18, and macOS Sequoia.

Apple Intelligence is comprised of multiple highly-capable generative models that are specialized for our users’ everyday tasks, and can adapt on the fly for their current activity. The foundation models built into Apple Intelligence have been fine-tuned for user experiences such as writing and refining text, prioritizing and summarizing notifications, creating playful images for conversations with family and friends, and taking in-app actions to simplify interactions across apps.

Apple Natural Language Understanding Workshop 2023

Earlier this year, Apple hosted the Natural Language Understanding workshop. This two-day hybrid event brought together Apple and members of the academic research community for talks and discussions on the state of the art in natural language understanding.

In this post, we share highlights from workshop discussions and recordings of select workshop talks.



In probing brain-behavior nexus, big datasets are better.

(Image: metallic sculpture of a human brain in front of a large language model illustration. AI-generated image and Adobe Stock, created and edited by Michael S. Helfenbein)

When designing machine learning models, researchers first train the models to recognize data patterns and then test their effectiveness. But if the datasets used to train and test aren’t sufficiently large, models may appear to be less capable than they actually are, a new Yale study reports.

When it comes to models that identify patterns between the brain and behavior, this could have implications for future research, contribute to the replication crisis affecting psychological research, and hamper understanding of the human brain, researchers say.

The findings were published July 31 in the journal Nature Human Behaviour.

Researchers increasingly use machine learning models to uncover patterns that link brain structure or function to, say, cognitive attributes like attention or symptoms of depression. Making these links allows researchers to better understand how the brain contributes to these attributes (and vice versa) and potentially enables them to predict who might be at risk for certain cognitive challenges based on brain imaging alone.

But models are only useful if they’re accurate across the general population, not just among the people included in the training data.

Often, researchers will split one dataset into a larger portion on which they train the model and a smaller portion used to test the model’s ability (since collecting two separate sets of data requires greater resources). A growing number of studies, however, have subjected machine learning models to a more rigorous test in order to evaluate their generalizability, testing them on an entirely different dataset made available by other researchers.

“And that’s good,” said Matthew Rosenblatt, lead author of the study and a graduate student in the lab of Dustin Scheinost, associate professor of radiology and biomedical imaging at Yale School of Medicine. “If you can show something works in a totally different dataset, then it’s probably a robust brain-behavior relationship.”

Adding another dataset into the mix, however, comes with its own complications — namely, in regard to a study’s “power.” Statistical power is the probability that a research study will detect an effect if one exists. For example, a child’s height is closely related to their age. If a study is adequately powered, then that relationship will be observed. If the study is “low-powered,” on the other hand, there’s a higher risk of overlooking the link between age and height.

There are two important aspects to statistical power — the size of the dataset (also known as the sample size) and the effect size. And the smaller that one of those aspects is, the larger the other needs to be. The link between age and height is strong, meaning the effect size is large; one can observe that relationship in even a small dataset. But when the relationship between two factors is more subtle — like, say, age and how well one can sense through touch — researchers would need to collect data from more people to uncover that connection.

While there are equations that can calculate how big a dataset should be to achieve enough power, there aren’t any to easily calculate how large two datasets — one training and one testing — should be.
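
For the single-dataset case, those standard equations are available off the shelf. As an illustration, the sketch below uses the statsmodels library to compute the sample size needed to detect effects of various assumed sizes at 80% power; the effect-size values are arbitrary examples, and no comparably simple formula covers the paired training/testing setup discussed here:

```python
# Classical power analysis for a two-sample comparison: how many participants
# per group are needed to detect a given effect size at 80% power?
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
for label, effect_size in [("large", 0.8), ("medium", 0.5), ("small", 0.2)]:
    n = analysis.solve_power(effect_size=effect_size, alpha=0.05, power=0.8)
    print(f"{label} effect (d={effect_size}): ~{n:.0f} participants per group")
```

The pattern the sketch makes visible matches the article’s point: as the effect size shrinks, the required sample size grows sharply.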

To understand how training and testing dataset sizes affect study power, researchers in the new study used data from six neuroimaging studies and resampled that data over and over, changing the dataset sizes to see how that affected statistical power.

“We showed that statistical power requires relatively large sample sizes for both training and external testing datasets,” said Rosenblatt. “When we looked at published studies in the field that use this approach — testing models on a second dataset — we found most of their datasets were too small, underpowering their studies.”

Among already published studies, the researchers found that the median sizes for training and testing datasets were 129 and 108 participants, respectively. For measures with large effect sizes, like age, those dataset sizes were big enough to achieve adequate power. But for measures with medium effect sizes, such as working memory, datasets of those sizes resulted in a 51% chance that the study would not detect a relationship between brain structure and the measure; for measures with low effect sizes, like attention problems, those odds increased to 91%.

“For these measures with smaller effect sizes, researchers may need datasets of hundreds to thousands of people,” said Rosenblatt.

As more neuroimaging datasets become available, Rosenblatt and his colleagues expect more researchers will opt to test their models on separate datasets.

“That’s a move in the right direction,” said Scheinost. “Especially with reproducibility being the problem it is, validating a model on a second, external dataset is one solution. But we want people to think about their dataset sizes. Researchers must do what they can with the data they have, but as more data becomes available, we should all aim to test externally and make sure those test datasets are large.”



Research Paper Classification Using Machine and Deep Learning Techniques


Published by the Association for Computing Machinery, New York, NY, United States.

Author tags:

  • deep learning
  • gradient-boosted trees
  • machine learning
  • text classification
  • topic classification

Identification of Dominant Phenomena and Analysis of Coastal Berm Changes Under Non-Storm Conditions Using Machine Learning

37 Pages. Posted: 3 Aug 2024

Soheil Ataei H.

Shahrood University of Technology

Mehdi Adjami

Saeid Gharechelou

Advances in machine learning, the establishment of research sites, and the collection of environmental and field data in coastal regions provide significant information that can be exploited to achieve coastal engineering goals. This study analyzes Narrabeen Beach from a different aspect using machine learning (ML). One of the most important factors in analyzing the performance of a beach in terms of its geometric shaping is to examine it under normal, non-storm conditions, which can be evaluated using the concept of beach equilibrium behavior. It is crucial to recognize and explain the factors that affect the equilibrium of a beach. In this research, shoreline changes and then changes in coastal berm geometry (as objective functions) were evaluated. The objective functions were analyzed and described using a regression decision tree algorithm. The optimal behavior pattern was evaluated using the error and complexity factors of the model, and relevant scenarios were selected to describe the parameters affecting them. ∆BW, berm slope, SLR, and ζ were selected to describe shoreline changes, with R2 = 82% and RMSE = 3.489 m; BC height, ∆x shoreline, ∆x BC, and P were used to describe coastal berm height changes, with R2 = 48% and RMSE = 0.397 m; and BW, berm slope, ∆y BC, BC height, E, and SLR were selected to describe the horizontal position of the berm crest, with R2 = 67% and RMSE = 9.807 m. Finally, to verify the performance of the descriptive algorithm and its results, the objective functions were predicted using an artificial neural network (ANN) and the parameters selected in the previous step. For shoreline changes, the maximum R2 was 92.01% and the minimum MSE was 9.30; for berm height changes, the maximum R2 was 77.43% and the minimum MSE was 0.140; and for the horizontal position of the berm crest, the maximum R2 was 79.83% and the minimum MSE was 120.73.
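
As a rough illustration of this descriptive approach, the sketch below fits a pruned regression decision tree to one objective function and reports R2 and RMSE. The file name, column names, and pruning value are hypothetical placeholders, not the study's data or settings; ccp_alpha simply stands in for the paper's error/complexity trade-off.

    # Hypothetical sketch of the regression-tree "description" step; the CSV
    # and column names are placeholders, not the study's data or settings.
    import pandas as pd
    from sklearn.tree import DecisionTreeRegressor
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import r2_score, mean_squared_error

    df = pd.read_csv("narrabeen_profiles.csv")         # hypothetical file
    X = df[["delta_BW", "berm_slope", "SLR", "zeta"]]  # predictors echoing the abstract
    y = df["shoreline_change"]                         # one objective function

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

    # ccp_alpha prunes the tree, trading error against model complexity.
    tree = DecisionTreeRegressor(ccp_alpha=0.01, random_state=0).fit(X_tr, y_tr)
    pred = tree.predict(X_te)
    rmse = mean_squared_error(y_te, pred) ** 0.5
    print(f"R2 = {r2_score(y_te, pred):.2%}, RMSE = {rmse:.3f} m")

The R2 and RMSE figures quoted in the abstract come from the authors' own data and tuning; a sketch like this only shows where such numbers originate.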

Keywords: Beach Description, Coastal Berm, Machine Learning, Non-Storm Conditions, Regression Decision Tree, Shoreface


Mehdi Adjami (Contact Author)

Shahrood University of Technology, Shahrood, Iran

Related eJournals

Artificial Intelligence eJournal

Subscribe to this fee journal for more curated articles on this topic

Computational Earth Science eJournal

Subscribe to this free journal for more curated articles on this topic

Earth-Surface Processes eJournal

Geomorphology & Sedimentary Geology eJournal

August 6, 2024

Researchers identify over 2,000 potential toxins using machine learning

by Hebrew University of Jerusalem

In a novel study, researchers have unveiled new secrets about a fascinating bacterial weapon system that acts like a microscopic syringe. The research paper, titled “Identification of novel toxins associated with the extracellular contractile injection system using machine learning,” is published in Molecular Systems Biology.

Led by Dr. Asaf Levy of the Hebrew University, together with collaborators from the Hebrew University and the University of Illinois Urbana-Champaign, the team has made significant strides in understanding the extracellular contractile injection system (eCIS), a unique mechanism used by bacteria and archaea to inject toxins into other organisms.

Cracking the bacterial code with AI

The eCIS is a 100-nanometer-long weapon that evolved from viruses (phages) that once attacked microbes. During evolution, these viruses lost their ability to infect microbes and turned into syringes that inject toxins into other organisms, such as insects.

Previously, the Levy group identified eCIS as a weapon carried by more than 1,000 microbial species. Interestingly, these microbes rarely attack humans, and the role of eCIS in nature remains mostly unknown. We do know, however, that it loads and injects protein toxins.

The specific proteins injected by eCIS and their functions have long remained a mystery; before this study, only about 20 toxins that eCIS can load and inject were known.

To solve this biological puzzle, the research team developed an innovative machine learning tool that combines genetic and biochemical data on different genes and proteins to accurately identify these elusive toxins. The project resulted in the identification of over 2,000 potential toxin proteins.
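
The article does not describe the model itself, so the following is only a plausible sketch of this kind of pipeline: per-protein features mixing genomic context with biochemical properties, a tree-ensemble classifier trained on the handful of known toxins versus background proteins, and a ranked list of candidates. The file, the feature names, and the model choice are all assumptions, not the published method.

    # Plausible sketch, not the published method: rank candidate eCIS toxins
    # with a tree ensemble over hypothetical per-protein features.
    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    df = pd.read_csv("ecis_protein_features.csv")      # hypothetical feature table
    features = ["distance_to_ecis_operon", "protein_length",
                "isoelectric_point", "signal_peptide_score"]
    X, y = df[features], df["is_known_toxin"]          # y = 1 for known toxins

    clf = RandomForestClassifier(n_estimators=500, class_weight="balanced",
                                 random_state=0)
    print("CV ROC AUC:", cross_val_score(clf, X, y, scoring="roc_auc", cv=5).mean())

    clf.fit(X, y)                                      # then score every protein
    df["toxin_score"] = clf.predict_proba(X)[:, 1]     # top scores = candidates

In a setting like this the positive class is tiny (only ~20 known toxins), so class weighting and cross-validated ranking matter far more than raw accuracy.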

"Our discovery not only sheds light on how microbes interact with their hosts and maybe with each other, but also demonstrates the power of machine learning in uncovering new gene functions," explains Dr. Levy. "This could open up new avenues for developing antimicrobial treatments or novel biotechnological tools."

New toxins with enzymatic activities against different molecules

Using AI technology, the researchers analyzed 950 microbial genomes and identified an impressive 2,194 potential toxins. Among these, four new toxins (named EAT14-17) were experimentally validated by demonstrating that they can inhibit the growth of bacteria or yeast cells.

Remarkably, one of these toxins, EAT14, was found to inhibit cell signaling in human cells, showcasing its potential impact on human health. The group showed that the new toxins likely act as enzymes that damage the target cells by targeting proteins, DNA or a molecule that is critical to energy metabolism. Moreover, the group was able to decipher the protein sequence code that allows loading of toxins into the eCIS syringe.

Recently, it was demonstrated that eCIS can be used as a programmable syringe that can be engineered for injection into various cell types, including brain cells. The new findings from the current paper leverage this ability by providing thousands of toxins that are naturally injected by eCIS and the code that facilitates their loading into the eCIS syringe. The code can be transferred into other proteins of interest.

From microscopic battles to medical breakthroughs

The study's findings could have far-reaching applications in medicine, agriculture, and biotechnology. The newly identified toxins might be used to develop new antibiotics, pesticides, or efficient enzymes for different industries, or to engineer microbes that can target specific pathogens.

This research highlights the incredible potential of combining biology with artificial intelligence to solve complex problems that could ultimately benefit human health.

"We're essentially deciphering the weapons that bacteria evolved and keep evolving to compete over resources in nature," adds Dr. Levy. "Microbes are creative inventors and it is fulfilling to be part of a group that discovers these amazing and surprising inventions."

The study was led by two students, Aleks Danov and Inbal Pollin, from the Department of Plant Pathology and Microbiology and the Institute of Environmental Sciences.

Journal information: Molecular Systems Biology

Provided by Hebrew University of Jerusalem

JMLR Papers

Select a volume number to see its table of contents with links to the papers.

Volume 23 (January 2022 - Present)

Volume 22 (January 2021 - December 2021)

Volume 21 (January 2020 - December 2020)

Volume 20 (January 2019 - December 2019)

Volume 19 (August 2018 - December 2018)

Volume 18 (February 2017 - August 2018)

Volume 17 (January 2016 - January 2017)

Volume 16 (January 2015 - December 2015)

Volume 15 (January 2014 - December 2014)

Volume 14 (January 2013 - December 2013)

Volume 13 (January 2012 - December 2012)

Volume 12 (January 2011 - December 2011)

Volume 11 (January 2010 - December 2010)

Volume 10 (January 2009 - December 2009)

Volume 9 (January 2008 - December 2008)

Volume 8 (January 2007 - December 2007)

Volume 7 (January 2006 - December 2006)

Volume 6 (January 2005 - December 2005)

Volume 5 (December 2003 - December 2004)

Volume 4 (April 2003 - December 2003)

Volume 3 (July 2002 - March 2003)

Volume 2 (October 2001 - March 2002)

Volume 1 (October 2000 - September 2001)

Special Topics

Bayesian Optimization

Learning from Electronic Health Data (December 2016)

Gesture Recognition (May 2012 - present)

Large Scale Learning (July 2009 - present)

Mining and Learning with Graphs and Relations (February 2009 - present)

Grammar Induction, Representation of Language and Language Learning (November 2010 - April 2011)

Causality (September 2007 - May 2010)

Model Selection (April 2007 - July 2010)

Conference on Learning Theory 2005 (February 2007 - July 2007)

Machine Learning for Computer Security (December 2006)

Machine Learning and Large Scale Optimization (July 2006 - October 2006)

Approaches and Applications of Inductive Programming (February 2006 - March 2006)

Learning Theory (June 2004 - August 2004)

Special Issues

In Memory of Alexey Chervonenkis (September 2015)

Independent Components Analysis (December 2003)

Learning Theory (October 2003)

Inductive Logic Programming (August 2003)

Fusion of Domain Knowledge with Data for Decision Support (July 2003)

Variable and Feature Selection (March 2003)

Machine Learning Methods for Text and Images (February 2003)

Eighteenth International Conference on Machine Learning (ICML2001) (December 2002)

Computational Learning Theory (November 2002)

Shallow Parsing (March 2002)

Kernel Methods (December 2001)

COMMENTS

  1. The latest in Machine Learning

    Explore the latest Machine Learning papers and code on various topics, such as visual language models, large language models, and segmentation. See the rankings, ratings, and publications of the most popular and relevant research.

  2. Journal of Machine Learning Research

    The Journal of Machine Learning Research (JMLR), established in 2000, provides an international forum for the electronic and paper publication of high-quality scholarly articles in all areas of machine learning. All published papers are freely available online. JMLR has a commitment to rigorous yet rapid reviewing. Final versions are published ...

  3. Machine learning

    Machine learning is the ability of a machine to improve its performance based on previous results. Machine learning methods enable computers to learn without being explicitly programmed and have ...

  4. JMLR Papers

    Browse the table of contents and links to the papers published in JMLR, a peer-reviewed journal covering all aspects of machine learning. Find papers by volume number, special topics, special issues, or conference proceedings.

  5. The Journal of Machine Learning Research

The Journal of Machine Learning Research (JMLR) provides an international forum for the electronic and paper publication of high-quality scholarly articles in all areas of machine learning. JMLR seeks previously unpublished papers that contain: new algorithms with empirical, theoretical, psychological, or biological justification; experimental and/or theoretical studies yielding new insight into ...

  6. The Journal of Machine Learning Research

    Benjamin Recht. Article No.: 20, Pages 724-750. This paper provides elementary analyses of the regret and generalization of minimum-norm interpolating classifiers (MNIC). The MNIC is the function of smallest Reproducing Kernel Hilbert Space norm that perfectly interpolates a label pattern on a finite ...

  7. Machine Learning: Algorithms, Real-World Applications and Research Directions

    A comprehensive review of machine learning techniques and their applications in various domains, such as cybersecurity, smart cities, healthcare, and more. Learn the principles, challenges, and potential research directions of different types of machine learning algorithms, such as supervised, unsupervised, semi-supervised, and reinforcement learning.

  8. Home

    Machine Learning is a peer-reviewed journal that publishes research on various learning methods and problems. It covers empirical studies, theoretical analysis, and applications of machine learning to real-world problems.

  9. Journal of Machine Learning Research

    The Journal of Machine Learning Research (JMLR), established in 2000, provides an international forum for the electronic and paper publication of high-quality scholarly articles in all areas of machine learning. All published papers are freely available online. News. 2022.02.18: New blog post: Retrospectives from 20 Years of JMLR. 2022.01.25 ...

  10. The latest in Machine Learning

    modelscope/agentscope • 25 Jul 2024. Recent advances in large language models (LLMs) have opened new avenues for applying multi-agent systems in very large-scale simulations. 4,149. 0.79 stars / hour. Paper. Code. Papers With Code highlights trending Machine Learning research and the code to implement it.

  11. Journal of Machine Learning Research

    MushroomRL: Simplifying Reinforcement Learning Research Carlo D'Eramo, Davide Tateo, Andrea Bonarini, Marcello Restelli, Jan Peters; (131):1−5, 2021. (Machine Learning Open Source Software Paper) Locally Differentially-Private Randomized Response for Discrete Distribution Learning

  12. Machine Learning with Applications

    Machine Learning with Applications (MLWA) is a peer reviewed, open access journal focused on research related to machine learning. The journal encompasses all aspects of research and development in ML, including but not limited to data mining, computer vision, natural language processing (NLP), intelligent systems, neural networks, AI-based software engineering, bioinformatics and their ...

  13. Machine Learning

    Explore the latest research papers on machine learning, including topics on falsifiable, replicable, and reproducible empirical ML research.

  14. Top 20 Recent Research Papers on Machine Learning and Deep Learning

    Machine learning, especially its subfield of Deep Learning, had many amazing advances in the recent years, and important research papers may lead to breakthroughs in technology that get used by billions of people. The research in this field is developing very quickly and to help our readers monitor the progress we present the list of most important recent scientific papers published since 2014.

  15. Machine learning

    A machine learning approach for vocal fold segmentation and disorder classification based on ensemble method. S. M. Nuruzzaman Nobel, S. M. Masfequier Rahman Swapno & M. F. Mridha.

  16. Machine learning

    Computational design of soluble and functional membrane protein analogues. A deep learning approach enables accurate computational design of soluble and functional analogues of membrane proteins ...

  17. 777306 PDFs

    Explore the latest full-text research PDFs, articles, conference papers, preprints and more on MACHINE LEARNING. Find methods information, sources, references or conduct a literature review on ...

  18. Machine Learning: Models, Challenges, and Research Directions

    Machine learning techniques have emerged as a transformative force, revolutionizing various application domains, particularly cybersecurity. The development of optimal machine learning applications requires the integration of multiple processes, such as data pre-processing, model selection, and parameter optimization. While existing surveys have shed light on these techniques, they have mainly ...

  19. A comprehensive study of groundbreaking machine learning research

    Furthermore, the evolving nature of the machine learning field, along with changes in terminology and subfields, may impact the identification of highly impactful papers across different decades. Interdisciplinary collaboration is a common aspect of ML research, making it challenging to carefully categorize papers within traditional boundaries ...

  20. Machine Learning: Algorithms, Real-World Applications and Research Directions

    A successful machine learning model depends on both the data and the performance of the learning algorithms. The sophisticated learning algorithms then need to be trained through the collected real-world data and knowledge related to the target application before the system can assist with intelligent decision-making.

  21. Journal of Machine Learning Research

    (Machine Learning Open Source Software Paper) Torchhd: An Open Source Python Library to Support Research on Hyperdimensional Computing and Vector Symbolic Architectures Mike Heddes, Igor Nunes, Pere Vergés, Denis Kleyko, Danny Abraham, Tony Givargis, Alexandru Nicolau, Alexander Veidenbaum; (255):1−10, 2023.

  22. Machine Learning

    Machine Learning. Abstract: In machine learning, a computer first learns to perform a task by studying a training set of examples. The computer then performs the same task with data it hasn't encountered before. This article presents a brief overview of machine-learning technologies, with a concrete case study from code analysis.

  23. [2407.20516] Machine Unlearning in Generative AI: A Survey

    Generative AI technologies have been deployed in many places, such as (multimodal) large language models and vision generative models. Their remarkable performance should be attributed to massive training data and emergent reasoning abilities. However, the models would memorize and generate sensitive, biased, or dangerous information originating from the training data, especially those from web ...

  24. The Impact of Machine Learning Algorithms on Academic Research Papers

    Machine learning algorithms have become increasingly prevalent in the field of academic research, revolutionizing the way researchers approach and analyze data. These advanced computational techniques have the potential to uncover patterns, make predictions, and generate insights that were previously unattainable through traditional research methods. As the volume and complexity of data ...

  25. Apple Intelligence Foundation Language Models

    This paper provides technical details for Apple's On-Device and Server Foundation Models, introduced on June 10, 2024.

  26. Think big: Testing brain-behavior machine learning requires large

    When designing machine learning models, researchers first train the models to recognize data patterns and then test their effectiveness. ... Statistical power is the probability that a research study will detect an effect if one exists. For example, a child's height is closely related to their age. If a study is adequately powered, then that ...

  27. Research Paper Classification Using Machine and Deep Learning Techniques

    Categorizing research papers into appropriate categories is one of the many components when organizing conferences, especially in the context of paper submissions and calls for papers. ... (GBT) algorithm, a machine learning technique, outperformed other algorithms, achieving a remarkable classification accuracy of 91.58%, followed by the Deep ...
