Developer Center

Resources to get you started with Algorithmia

Scala

Before you start learning about Scala algorithm development, go through our Getting Started Guide to learn how to create your first algorithm, understand the available permissions, work with versioning, use the CLI, and more.

Available APIs

Algorithmia makes a number of libraries available to make algorithm development easier. The full Java 11 language and standard library are available for you to use in your algorithms. In addition, algorithms can call other algorithms and manage data on the Algorithmia platform via the Algorithmia Scala Client.

Managing Dependencies

Algorithmia supports adding third-party dependencies via Maven packages. Specifically, any package from Maven Central can be added to an algorithm. On the algorithm editor page, click Options and select Manage Dependencies.

Add dependencies by adding lines of the following form:

libraryDependencies += GroupID % ArtifactID % Version

For example, to add Apache Commons Math version 3.4.1:

libraryDependencies += "org.apache.commons" % "commons-math3" % "3.4.1"

Automatic JSON parsing

package algorithmia.Example

import com.algorithmia._
import com.algorithmia.algo._
import com.algorithmia.data._

class Example {
  // java.util.Map is used here because GSON cannot parse scala.collection.Map
  def apply(dict: java.util.Map[String, String], key: String): String = {
    dict.get(key)
  }
}

By default, Algorithmia uses Google’s Gson library for converting JSON to and from native Java objects. You can specify the input and output types of your algorithm simply by setting the parameters and return type of your apply() method.

Gson is a pure Java library and does not support many native Scala types. For example, List[Int] does not parse automatically, but Array[Int] does, because an Array in Scala is actually a Java array. Similarly, java.util.Map parses correctly, but scala.collection.Map does not.

This example shows a function that takes two parameters, a Map from Strings to Strings (dict) and another String (key), and returns another String.

Algorithmia can automatically parse many types of native Java objects to and from JSON: Integers, Lists, Arrays, Maps, and many others. In many cases it can also parse arbitrary user-defined Java Classes to and from JSON. See the Gson User Guide for reference.
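Since only Java types are parsed automatically, one common pattern is to keep Java types in the apply() signature and convert to Scala collections inside the method. The following is a minimal sketch of that pattern, not an official template: the class name Lookup and its behavior are hypothetical, and it assumes a Scala version where scala.collection.JavaConverters is available (it is deprecated in Scala 2.13 in favor of scala.jdk.CollectionConverters, but still present).

```scala
import java.util.{List => JList, Map => JMap}
import scala.collection.JavaConverters._

// Hypothetical algorithm class: the signature uses only Java-friendly types
// (java.util.Map, a Java array, java.util.List), while the body uses Scala collections.
class Lookup {
  def apply(dict: JMap[String, String], keys: Array[String]): JList[String] = {
    // Convert the Java map to an immutable Scala Map for idiomatic access.
    val scalaDict: Map[String, String] = dict.asScala.toMap

    // Keep the values of the keys that exist, in the order the keys were given.
    val found: List[String] = keys.toList.flatMap(k => scalaDict.get(k))

    // Convert back to a java.util.List before returning.
    found.asJava
  }
}
```

Keeping the conversion at the edges of apply() means the rest of the algorithm can be written against ordinary Scala collections.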

Custom JSON parsing

If you want more control over parsing, then use a single apply method accepting a String and give it the @AcceptsJson annotation (from the com.algorithmia.algo package).

package algorithmia.Example

import com.algorithmia._
import com.algorithmia.algo._
import com.algorithmia.data._

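class Example {
  @AcceptsJson
  def apply(jsonString: String): String = {
    // jsonString is the raw JSON input; do your JSON parsing here.
    // This sketch assumes Gson is on the classpath and uses its tree model,
    // then echoes the parsed value back as a string.
    val parsed = new com.google.gson.JsonParser().parse(jsonString)
    parsed.toString
  }
}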

On the other hand, if Gson doesn’t serialize your output to JSON correctly (or you want to do some custom serialization), you can add the @ReturnsJson annotation to your apply method and return a serialized JSON String.

package algorithmia.Example

import com.algorithmia._
import com.algorithmia.algo._
import com.algorithmia.data._

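class Example {
  @ReturnsJson
  def apply(foo: String, bar: String): String = {
    // Do some work, then build the response by hand
    // (this sketch assumes Gson is on the classpath)
    val result = new com.google.gson.JsonObject()
    result.addProperty("foo", foo)
    result.addProperty("bar", bar)
    // Return a JSON string
    result.toString
  }
}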

Advanced Serialization Techniques

Not every algorithm is stateless, and sometimes you need to preserve state in the data API. Ensuring that your algorithm state can be downloaded and deserialized quickly and efficiently is critical for ensuring that your algorithm executes in a reasonable time frame.

For state serialization in Scala, we recommend BooPickle. It serializes to and from a compact binary format, which is typically faster than parsing JSON and produces a much smaller payload than the equivalent JSON.

Error Handling

throw new AlgorithmException("Invalid graph structure")

Algorithms can throw any exception, and it will be returned as an error via the Algorithmia API. If you want to throw a generic exception with your own message, use an AlgorithmException.
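As an illustration of validating input before doing any work, here is a small sketch. The class SafeDivide is hypothetical, and it throws a standard IllegalArgumentException so that the example has no dependency on the Algorithmia client; inside a real algorithm an AlgorithmException could be thrown at the same point.

```scala
// Hypothetical algorithm that checks its input up front and fails with a clear message.
class SafeDivide {
  def apply(numerator: Double, denominator: Double): Double = {
    if (denominator == 0.0) {
      // Any exception type can be thrown; AlgorithmException would also work here.
      throw new IllegalArgumentException("denominator must be non-zero")
    }
    numerator / denominator
  }
}
```

Failing early with a descriptive message makes the error that comes back through the API easier for the caller to act on.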

Writing files for the user to consume

Sometimes it is more appropriate to write your output to a file than to return it directly to the caller. In these cases, you may need to create a temporary file, then copy it to a Data URI (usually one which the caller specified in their request, or a Temporary Algorithm Collection):

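import java.io.File
import java.nio.file.Files
import java.util.UUID

val file_uri = "data://username/collection/filename.txt"
val tempfile = new File("/tmp/" + UUID.randomUUID() + ".tmp")
Files.write(tempfile.toPath, "some output".getBytes("UTF-8")) // replace with your own output-writing code
client.file(file_uri).putFile(tempfile) // client: your Algorithmia client instance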

Working with directories

While running, algorithms have access to a temporary filesystem located at /tmp, the contents of which do not persist across calls to the algorithm. While the Data API allows you to get the contents of the files you want to work with as JSON, a string, or raw bytes, in some cases you might need your algorithm to read and write files locally. This can be useful as a temporary location to store files downloaded from Hosted Data, such as raw data for processing or models to be loaded into your algorithms. It can also be used to write new files before uploading them via the Data API.

For reference, this gist provides an example of iterating over data in a directory, processing it, and writing new data to a file, while this template for ALBERT and TensorFlow provides an example of using the /tmp directory to load a model.
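The local-file workflow described above can be sketched with the standard java.nio.file API alone. This is an illustrative example rather than the gist's code: the object name TmpWorkspace, the "algo-work-" prefix, and the summary.tsv file name are all hypothetical, and it assumes the JVM's default temporary directory is /tmp, as is typical on Linux.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.nio.file.{Files, Path}
import scala.collection.JavaConverters._

object TmpWorkspace {
  // Creates a scratch directory under the JVM's default temp location
  // (java.io.tmpdir, which is /tmp on typical Linux setups).
  def create(): Path = Files.createTempDirectory("algo-work-")

  // Counts the lines of every regular file in `dir` and writes a
  // "name<TAB>count" summary file into the same directory.
  def summarize(dir: Path): Path = {
    val stream = Files.list(dir)
    val files: List[Path] =
      try {
        stream.iterator().asScala
          .filter(p => Files.isRegularFile(p))
          .toList
          .sortBy(_.getFileName.toString)
      } finally {
        stream.close() // directory streams hold an OS handle and must be closed
      }

    val lines = files.map(f => s"${f.getFileName}\t${Files.readAllLines(f, UTF_8).size()}")
    Files.write(dir.resolve("summary.tsv"), lines.asJava, UTF_8)
  }
}
```

In an algorithm, the files in the scratch directory would usually be downloaded through the Data API first, and the resulting summary file uploaded again afterwards, since nothing under /tmp persists across calls.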

Calling Other Algorithms and Managing Data

To call other algorithms or manage data from your algorithm, use the Algorithmia Scala Client, which is automatically available to any algorithm you create on the Algorithmia platform. For more detailed information on how to work with data, see the Data API docs and learn about Algorithmia’s Hosted Data Source.

When designing your algorithm, don’t forget that there are special data directories, .session and .algo, that are available only to algorithms to help you manage data over the course of the algorithm execution.

You may call up to 24 other algorithms, either in parallel or recursively.
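For the parallel case, one possible approach is to fan the calls out with Scala Futures and collect the results in order. The sketch below deliberately leaves out the client: the call parameter is a hypothetical stand-in for whatever function invokes another algorithm and returns its result, and FanOut, callAll, and the 30-second default timeout are illustrative names and values, not part of the Algorithmia API.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object FanOut {
  // `call` is a hypothetical stand-in for a function that invokes another
  // algorithm with one input and returns its result.
  def callAll(inputs: Seq[String], call: String => String, timeout: FiniteDuration = 30.seconds)
             (implicit ec: ExecutionContext): Seq[String] = {
    // Start every call up front so they can run concurrently on the execution context.
    val pending: Seq[Future[String]] = inputs.map(in => Future(call(in)))

    // Future.sequence keeps the input order; Await blocks until all calls
    // finish or the timeout elapses, and rethrows the first failure if any call fails.
    Await.result(Future.sequence(pending), timeout)
  }
}
```

When sizing the fan-out, keep the limit on the number of called algorithms stated above in mind.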

Additional Resources