Kotlin is a programming language that is readable, concise, and easy to learn. Because it is a JVM language, it performs well and gives you access to the entire ecosystem of Java libraries.
Null safety and static typing help you write readable, maintainable code that is much easier to troubleshoot.
All these characteristics make Kotlin useful for working with data – from data pipelines to machine learning models. Read on to find out more about why Kotlin is great for data science.
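As a small illustration of how null safety and static typing help with data work, here is a minimal sketch that uses only the Kotlin standard library. The `Measurement` type and the `meanOfPresent` function are hypothetical names made up for this example. They model a dataset in which some readings are missing.

```kotlin
// Hypothetical record type for illustration; not taken from any library.
// A missing reading is modelled as null, and the type Double? says so explicitly.
data class Measurement(val sensor: String, val value: Double?)

// Mean of the readings that are present; returns null when there are none.
fun meanOfPresent(rows: List<Measurement>): Double? {
    val present = rows.mapNotNull { it.value } // List<Double>, nulls dropped
    return if (present.isEmpty()) null else present.average()
}

fun main() {
    val rows = listOf(
        Measurement("a", 1.0),
        Measurement("b", null),
        Measurement("c", 3.0)
    )
    // Writing rows[1].value + 1.0 would be rejected at compile time,
    // because value has type Double? rather than Double.
    val mean = meanOfPresent(rows)
    println(mean ?: "no data")
}
```

Because the missing value is part of the type, the compiler forces every caller to decide what to do about it, so the problem does not surface later as a runtime exception in the middle of a pipeline.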
Interactive editors
Notebooks such as Apache Zeppelin and Jupyter Notebook provide excellent tools for exploratory research and data visualization. Kotlin integrates with both, so you can easily explore data, share results with your teammates, or build up your machine learning and data science skills.
Zeppelin Kotlin interpreter
Apache Zeppelin is a popular web-based tool for interactive data analysis. It offers solid support for the Apache Spark cluster computing system, which makes it well suited to data engineering. Since version 0.9.0, Apache Zeppelin has shipped with a Kotlin interpreter.
Jupyter Kotlin kernel
Jupyter Notebook is an open-source web application you can use to create and share notebooks that contain code, visualizations, and Markdown text. Kotlin-Jupyter is a project that brings Kotlin support to Jupyter Notebook.
Libraries
A rapidly expanding ecosystem of libraries for data-related tasks is being built by the Kotlin community. The most useful ones are listed here:
Kotlin libraries
- KotlinDL – a high-level deep learning API. You can use it to train deep learning models from scratch, import existing Keras models for inference, and apply transfer learning to adapt existing pre-trained models.
- Kotlin for Apache Spark – allows you to use familiar language features like data classes and adds a missing layer of compatibility between Kotlin and Apache Spark.
- kotlin-statistics – provides extension functions for exploratory and production statistics. Supports slicing operators, basic numeric list/sequence/array functions, binning operations, a naive Bayes classifier, discrete PDF sampling, linear regression, clustering, and much more.
- kmath – supports algebraic operations and structures, array-like structures, histograms, math expressions, streaming operations, and more.
- krangl – can be used for data manipulation through a functional-style API; offers functions for filtering, aggregating, transforming, and reshaping tabular data.
- lets-plot – a multiplatform plotting library for statistical data, written in Kotlin. It can be used on the JVM as well as with Python and JS.
- kravis – a library for visualizing tabular data.
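The filtering, aggregating, and transforming operations mentioned in the list above can be pictured with plain standard-library collections. The following is only a sketch of the idea. It deliberately does not use krangl or any other listed library, and `Sale` and `revenueByRegion` are hypothetical names invented for this example.

```kotlin
// Hypothetical row type for illustration; not part of any library listed above.
data class Sale(val region: String, val units: Int, val unitPrice: Double)

// Filter rows, group them by region, and aggregate revenue per group,
// using only Kotlin standard-library collection functions.
fun revenueByRegion(sales: List<Sale>, minUnits: Int = 1): Map<String, Double> =
    sales
        .filter { it.units >= minUnits }   // drop rows below the threshold
        .groupBy { it.region }             // Map<String, List<Sale>>
        .mapValues { (_, rows) -> rows.sumOf { it.units * it.unitPrice } }

fun main() {
    val sales = listOf(
        Sale("north", 2, 10.0),
        Sale("north", 1, 5.0),
        Sale("south", 0, 99.0) // removed by the filter: zero units sold
    )
    println(revenueByRegion(sales))
}
```

Dedicated data-frame libraries offer the same style of chained operations over tabular data, typically with column-oriented storage and a richer set of reshaping functions than the standard library provides.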
Java libraries
Kotlin provides first-class interoperability with Java, so you can also use Java libraries for data science in your Kotlin code. The following are particularly useful:
- DeepLearning4J – a deep learning Java library
- ND4J – a matrix math library for the JVM
- Dex – a data visualization tool
- Smile – a machine learning, linear algebra, natural language processing, interpolation, graph, and visualization system. In addition to its Java API, it also provides a functional Kotlin API, as well as Clojure and Scala APIs.
- Apache Commons Math – a general math, machine learning, and statistics library for Java
- Charts – a scientific charting library
- CoreNLP – a natural language processing toolkit
- Apache Mahout – a distributed framework for clustering, regression, and recommendation
- Weka – a collection of machine learning algorithms
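To show what Java interoperability looks like in practice without assuming the API of any library listed above, this minimal sketch calls a class from the Java standard library, `java.util.DoubleSummaryStatistics`, directly from Kotlin. The `summarize` helper is a hypothetical name introduced for this example.

```kotlin
import java.util.DoubleSummaryStatistics

// Hypothetical helper: feeds a Kotlin list into a plain Java class.
// No wrapper or binding layer is needed to call Java code from Kotlin.
fun summarize(values: List<Double>): DoubleSummaryStatistics {
    val stats = DoubleSummaryStatistics()
    for (v in values) stats.accept(v) // Java method called like any Kotlin method
    return stats
}

fun main() {
    val stats = summarize(listOf(1.0, 2.0, 6.0))
    // Java getters such as getMin() and getAverage() are available
    // as Kotlin properties.
    println("count=${stats.count} min=${stats.min} max=${stats.max} mean=${stats.average}")
}
```

Third-party Java libraries are called in the same way once they are added as dependencies of the project.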