We're now happy to announce the very first public version of JuliaHub.jl — the Julia language client package for the JuliaHub platform APIs. It exposes some of our backend APIs, allowing users to programmatically upload datasets and work with JuliaHub jobs. All of this can be done via the Julia REPL or scripts. It is available here on the JuliaHub package registry.
The package can be used either locally (on your machine), in JuliaHub Cloud IDEs, or in JuliaHub jobs. While it is designed to be ergonomic to use interactively in the REPL, it can also be used as a library component in applications and scripts that interact with JuliaHub. The JuliaHub.jl package is open source and available in the Julia General registry, and can be added via the package manager. See the documentation here.
Here are some of the feature highlights and examples from the initial release:
While JuliaHub's DataSets.jl integration allows you to access datasets that have been uploaded to JuliaHub, and directly use them in jobs and IDE sessions, it does not offer any way to create new or update existing ones. JuliaHub.jl complements DataSets.jl by offering a whole suite of additional functionality to work with datasets.
As an example, upload_dataset can be used to easily upload a local file or directory as a new dataset, or to update an existing one.
using JuliaHub
# When JULIA_PKG_SERVER is not set, you need to pass an explicit argument
# to JuliaHub.authenticate. This call is unnecessary when running in JuliaHub
# environments.
JuliaHub.authenticate("juliahub.com")
# update=true means that we will upload a new version if a dataset with
# this name already exists
JuliaHub.upload_dataset("my-analysis-results", "results.tar.gz", update=true)
Another potential application of the new package is to perform bulk updates or deletions of datasets. In the following example, we add a new tag to all the datasets whose name matches a particular pattern.
using JuliaHub
# When JULIA_PKG_SERVER is not set, you need to pass an explicit argument
# to JuliaHub.authenticate. This call is unnecessary when running in JuliaHub
# environments.
JuliaHub.authenticate("juliahub.com")
# Find all the datasets that have names that start with 'my-analysis-'
myanalysis_datasets = filter(
    dataset -> startswith(dataset.name, "my-analysis-"),
    JuliaHub.datasets()
)
# .. and now add a 'new-tag' tag to each of them
for dataset in myanalysis_datasets
    @info "Updating" dataset
    # Note: tags = ... overrides the whole list, so you need to manually retain
    # old tags.
    new_tags = [dataset.tags..., "new-tag"]
    JuliaHub.update_dataset(dataset, tags = new_tags)
end
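One thing to watch for with the loop above: because `tags = ...` replaces the whole list, running the script a second time may append "new-tag" again. Below is a minimal sketch of a guard against that. The `add_tag` helper is hypothetical (it is not part of JuliaHub.jl) and is plain Julia, so it can be tried without a JuliaHub account.

```julia
# Hypothetical helper, not part of JuliaHub.jl: return a copy of `tags`
# with `tag` appended, unless `tag` is already present.
add_tag(tags, tag) = tag in tags ? collect(tags) : [tags..., tag]

# The existing tags are retained, and the new tag is only added once.
tags_once = add_tag(["raw", "2023"], "new-tag")
tags_twice = add_tag(tags_once, "new-tag")
@assert tags_once == tags_twice
```

In the loop, `new_tags = add_tag(dataset.tags, "new-tag")` would then take the place of the splatting expression.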
There are additional things you can do. For example, JuliaHub.download_dataset allows you to easily download your datasets to your local computer. See the documentation for more examples and for the full list of capabilities.
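As a rough sketch of the download direction, the following fetches the dataset uploaded earlier. It assumes that `JuliaHub.download_dataset` accepts a dataset name followed by a local destination path. The destination filename is made up for illustration, so check the documentation for the exact signature and options.

```julia
using JuliaHub

# When JULIA_PKG_SERVER is not set, you need to pass an explicit argument
# to JuliaHub.authenticate.
JuliaHub.authenticate("juliahub.com")

# Assumption: the first argument identifies the dataset (here by name) and
# the second is the local path to write to. "downloaded-results.tar.gz" is
# a hypothetical filename.
JuliaHub.download_dataset("my-analysis-results", "downloaded-results.tar.gz")
```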
So far, the standard ways of starting JuliaHub jobs have been the web interface and the JuliaHub VS Code extension. This new client package provides another option: you can programmatically construct and submit jobs to be executed on the cloud cluster.
To illustrate, a simple job can be submitted with a single command.
using JuliaHub
# When JULIA_PKG_SERVER is not set, you need to pass an explicit argument
# to JuliaHub.authenticate. This call is unnecessary when running in JuliaHub
# environments.
JuliaHub.authenticate("juliahub.com")
# Submit a simple one-line script to JuliaHub.
job = JuliaHub.submit_job(
    JuliaHub.script"""
    println("Hello JuliaHub!")
    """
)
By default, your currently active package environment is automatically included with the job, so any packages available in your REPL should also be available to the script. If that does not match your workflow, the behavior is configurable, and you have full control over the environment that gets submitted to JuliaHub.
One thing to note is that locally checked out development dependencies and private dependencies are not included with such a script job, so a job that relies on them will likely fail to instantiate the Julia environment properly. For these cases, the JuliaHub.appbundle function makes it possible to submit more complex project bundles that can include additional files and local dependencies.
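A bundle submission might look roughly like the sketch below. It assumes that `JuliaHub.appbundle` takes the bundle directory and the path of the entry script within it. The directory name `my-project` and the script name `main.jl` are hypothetical.

```julia
using JuliaHub

# When JULIA_PKG_SERVER is not set, you need to pass an explicit argument
# to JuliaHub.authenticate.
JuliaHub.authenticate("juliahub.com")

# Assumption: the first argument is the directory to bundle and the second
# is the script inside it that the job should run. Both names below are
# placeholders for your own project.
job = JuliaHub.submit_job(
    JuliaHub.appbundle("my-project", "main.jl")
)
```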
You can also use JuliaHub.jl to monitor your jobs and access the logs. For finished jobs, you can use JuliaHub.jl to access the job outputs set in the job via ENV["RESULTS"] or ENV["RESULTS_FILE"]. See the jobs guide for more information and examples.
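The following sketch shows one way the monitoring side could look. It assumes that `JuliaHub.wait_job` blocks until the job has finished and returns an updated job object, and that the object exposes `status` and `results` fields. Treat these names as assumptions and verify them against the jobs guide.

```julia
using JuliaHub

# When JULIA_PKG_SERVER is not set, you need to pass an explicit argument
# to JuliaHub.authenticate.
JuliaHub.authenticate("juliahub.com")

job = JuliaHub.submit_job(
    JuliaHub.script"""
    println("Hello JuliaHub!")
    """
)

# Assumption: wait_job polls the job until it finishes and returns the
# refreshed job object.
job = JuliaHub.wait_job(job)
@info "Job finished" job.status

# Assumption: `results` holds whatever the job stored in ENV["RESULTS"].
# This particular script does not set it, so nothing useful is expected here.
job.results
```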
Getting libraries to authenticate with cloud platforms can often be painful. Fortunately, JuliaHub.jl makes it easy to obtain the authentication token that the library then uses to communicate with the instance. The call below starts a simple browser-based authentication flow, or just reuses an existing token if one is already present. The acquired token is stored in the Julia depot (e.g. in ~/.julia).
using JuliaHub
JuliaHub.authenticate("juliahub.com")
In principle, all of our API functions that require authentication take an auth = ... keyword argument, to which the authentication object returned by JuliaHub.authenticate should be passed. Since that quickly becomes tedious, JuliaHub.jl also remembers the result of the last authentication in each Julia session and uses it by default.
auth = JuliaHub.authenticate("juliahub.com")
# This is fine...
JuliaHub.datasets(; auth)
# .. but unnecessary. This works equally well
JuliaHub.datasets()
Care should be taken when using JuliaHub.jl as a library, however. In that case, the authentication object should always be passed explicitly, since omitting the keyword argument may lead to the code accidentally picking up the global token.
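A minimal sketch of what that could look like in library code: the function takes the authentication object as an argument and forwards it to every JuliaHub.jl call. The `dataset_names` function is hypothetical and only serves as an illustration.

```julia
using JuliaHub

# Hypothetical library function: it never relies on the session-global
# authentication, and always forwards the `auth` object it was given.
function dataset_names(auth)
    return [dataset.name for dataset in JuliaHub.datasets(; auth)]
end

# The caller decides which authentication to use.
auth = JuliaHub.authenticate("juliahub.com")
dataset_names(auth)
```

Written this way, a caller that works with several JuliaHub instances in one session can be sure which token each call uses.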
Finally, when the JULIA_PKG_SERVER environment variable is set and pointing to a JuliaHub instance, then there is no need to explicitly call JuliaHub.authenticate at all. This is the case in JuliaHub cloud environments (IDEs and jobs) and the following just works.
julia> using JuliaHub
julia> JuliaHub.datasets()
This first release of JuliaHub.jl is a beta, and there may be breaking changes to the APIs over the next few releases. Feedback based on real world usage is invaluable, and we invite anyone to open issues on the JuliaHub.jl issue tracker. We also expect to add additional capabilities to the package over time and expose more of the JuliaHub functionality programmatically.
You can log in and give feedback on this form.
Please also refer to our documentation for more information, examples, and the reference readme available here.
Should you want to use these APIs to integrate your existing workflows with JuliaHub, just sign up for a free JuliaHub.com account and get started today.