Designing with APIs

For a long time now, Data Foundry has not only been about storing data, but also about designing with data. That often means interacting with APIs (application programming interfaces) to unlock additional or special functionality. On this documentation page, we collect the different APIs that have become available on Data Foundry in recent months.

Before we head into the different APIs and their usage, let's look at how to get API access on Data Foundry in the first place: by generating an API key.

API Access

Here is a step-by-step guide on how to enable API access for one or more of your projects on Data Foundry:

  1. Create a new project or use an existing project; important: you need to be the owner of the project
  2. Open the project edit page (pen icon on main project page)
  3. Scroll down to the API Access section
  4. Read information carefully and click the button to generate a new API key
  5. Copy the API key; it should look like df-AHFJed65hg09sdv098asdvadv98
  6. Add a new script or open an existing script in the project
  7. Try a few of the examples below, using the copied API key for <API-KEY>

All APIs work with the same API key. So, once you have generated a key for your project, you can use it with all available APIs.


Before we start, a quick note: the use of all APIs is controlled by Data Foundry and free for all users. We monitor the usage of the APIs and also log all prompts that are submitted, for quality assurance and debugging purposes. If you suspect that your API key has leaked, generate a new one (which invalidates the old one) and contact us for help. By using the API you acknowledge these terms.


OpenAI API

The first API that we have integrated in Data Foundry is the OpenAI API, which gives you access to advanced AI: large language models (LLMs) that can do a variety of things based on textual prompts.

In the following, we explain the basics of sending requests to the API; this is documented in more depth on the OpenAI for scripts page.

OpenAI: DF scripting

You can use the OpenAI API in scripts on Data Foundry. All API capabilities are available for direct HTTP API calls as well as for access from Python and JavaScript; see below.
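As a minimal sketch of what a call could look like in a DF script (we assume here that DF.api accepts an "openai" API type with the same request format as the HTTP examples below; check the OpenAI for scripts page for the exact parameters):

let result = DF.api("openai", {
  "api_token": "df-abcdef1234567890abcdef1234567890abcdef123456789",
  "task": "chat",
  "messages": [{"role": "user", "content": "this is a mouse, who are you?"}]
})
DF.print(result)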

OpenAI: Command line

You can now send API requests directly from any prototype that can access the web and send HTTP requests, e.g., an ESP32 or a Raspberry Pi. As an example, we show the command line call for Mac/Linux below:

curl -X POST -H 'Content-Type: application/json' \
-d '{"api_token": "df-123456789....", "task": "completion",  "model": "ada",  "prompt": "Imagine you are a tomato, what is your biggest goal in life?"}' \
 https://data.id.tue.nl/api/vendor/openai/<PROJECT_ID>

What do you need?

  1. Your project id, which needs to match the generated API key. The project id needs to be appended to the URL; replace <PROJECT_ID> with the number.
  2. The completion, chat or moderation request in JSON format, which is exactly the same format as you would use in scripting (see the example below).
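For example, a chat request follows the same JSON format as the Python and JavaScript examples further down:

curl -X POST -H 'Content-Type: application/json' \
-d '{"api_token": "df-123456789....", "task": "chat", "messages": [{"role": "user", "content": "this is a mouse, who are you?"}]}' \
 https://data.id.tue.nl/api/vendor/openai/<PROJECT_ID>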

OpenAI: Python

import requests

api_key = 'df-abcdef1234567890abcdef1234567890abcdef123456789='
messages = []
headers = {'Content-Type': 'application/json'}

# add user prompt to history
messages.append({'role':'user', 'content': 'this is a mouse, who are you?'})

# check the OpenAI for scripting docs for more details
data = {
	'api_token': api_key,
	'task': 'chat',
	'messages': messages
}
response = requests.post('https://data.id.tue.nl/api/vendor/openai/<PROJECT-ID>', headers=headers, json=data)

# raw response output for debugging
print(response.text)

# use response parsed as JSON
jsonResponse = response.json()
print(jsonResponse['content'])
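To carry on a conversation over multiple turns, append the assistant's reply to the message history before adding the next user prompt. A short continuation of the example above:

# append the assistant reply and a follow-up question to the history
messages.append({'role': 'assistant', 'content': jsonResponse['content']})
messages.append({'role': 'user', 'content': 'and what do you like to eat?'})

# the data dict still references the same messages list, so we can post it again as-is
response = requests.post('https://data.id.tue.nl/api/vendor/openai/<PROJECT-ID>', headers=headers, json=data)
print(response.json()['content'])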

OpenAI: JavaScript

var api_key = "df-abcdef1234567890abcdef1234567890abcdef123456789="
var messages = [{"role": "user", "content": "this is a mouse, who are you?"}]

// send a POST request to the API
fetch("http://data.id.tue.nl/api/vendor/openai/<PROJECT-ID>", {
	method: "POST",
	cache: "no-cache",
	headers: {
		"Content-Type": "application/json"
	},
	referrerPolicy: "no-referrer",
	body: JSON.stringify({
		"api_token": api_key, 
		"task": "chat",
		"messages": messages
	}),
})
.then((response) => response.json())
.then((json) => {
	// check content of response
	console.log(json)
	// json.content contains the generated chat response
	chatResponse = json.content
});

Text-to-Speech

Text-to-Speech (TTS) is the synthesis of speech from a textual input. For example, TTS will turn the sentence "this is a dog, not a cat" into an audio file that contains the spoken sentence "this is a dog, not a cat". At the moment, we are using the tool espeak for this, which is very fast and convenient, but does not always sound great. Espeak supports different languages, which you can select with the lang request parameter. English is the default; other languages can be configured by their two-letter language code (nl for Dutch, de for German, fr for French, etc.). We are working on an improved TTS pipeline with better voice control and synthesis options. Stay tuned.

Successful TTS API calls return a JSON response that contains a token, which can be used to download the generated audio file. The audio files are kept on the server for about 10 minutes, after which they are deleted.

TTS: DF scripting

Using the TTS API is very simple in DF scripts: just call the DF.api function with the API type t2s as the first parameter and an object containing the api_token, the lang, and the text to speak as the second parameter.

let result = DF.api("t2s", {
  "api_token": "df-abcdef1234567890abcdef1234567890abcdef123456789",
  "lang": "en", // or "nl" for Dutch, "de" for German, etc.
  "text": "this is a test, do your best!"
})
DF.print(result)

TTS: Command line

curl -X POST -H 'Content-Type: application/json' \
	-d '{"api_token": "df-123456789....", "text": "this is a test, do your best!"}' \
	https://data.id.tue.nl/api/vendor/t2s/<PROJECT_ID>

If the call is successful, the response is a JSON object with a token that you can use to download the generated audio file:

curl -O https://data.id.tue.nl/api/vendor/t2s/<TOKEN>
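If you have jq installed, you can chain the two calls in one go. Note that we assume here that the token is returned in the text field of the response, as in the JavaScript example below:

TOKEN=$(curl -s -X POST -H 'Content-Type: application/json' \
	-d '{"api_token": "df-123456789....", "text": "this is a test, do your best!"}' \
	https://data.id.tue.nl/api/vendor/t2s/<PROJECT_ID> | jq -r '.text')
curl -O https://data.id.tue.nl/api/vendor/t2s/$TOKEN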

TTS: Python

import requests

api_key = 'df-abcdef1234567890abcdef1234567890abcdef123456789='
headers = {'Content-Type': 'application/json'}

# the TTS request payload, see the parameter description above
data = {
	'api_token': api_key,
	'lang': 'en', # or 'nl' for Dutch, 'de' for German, etc.
	'text': 'this is a test, do your best!'
}
response = requests.post('https://data.id.tue.nl/api/vendor/t2s/<PROJECT-ID>', headers=headers, json=data)

# raw response output for debugging
print(response.text)

# parse the response and use the token to construct the download link
jsonResponse = response.json()
print("https://data.id.tue.nl/api/vendor/t2s/" + jsonResponse['content'])

TTS: JavaScript

// send a POST request to the API
fetch("http://data.id.tue.nl/api/vendor/t2s/<PROJECT-ID>", {
  method: "POST",
  cache: "no-cache",
  headers: {
    "Content-Type": "application/json"
  },
  referrerPolicy: "no-referrer",
  body: JSON.stringify({
      api_token: "df-abcdef1234567890abcdef1234567890abcdef123456789", 
		  lang: "en", // or "nl" for Dutch, "de" for German, etc.
      text: "this is a test, do your best!"
  }),
})
.then((response) => response.json())
.then((json) => {
  // check content of response
  console.log(json)
  // json.text contains the token that you can use to create the download link
  downloadLink = "https://data.id.tue.nl/api/vendor/t2s/" + json.text
});

When you use the TTS API on a website, you can use the download link in an <audio> element. Just set the src attribute of the element and you have an audio player for the generated speech.
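For example, a minimal sketch that reuses the downloadLink variable from the JavaScript example above (the speechPlayer id is just for illustration):

<audio id="speechPlayer" controls></audio>
<script>
	// point the audio element at the generated speech file
	document.querySelector("#speechPlayer").src = downloadLink
</script>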


Speech-to-Text

Speech-to-Text (STT) is the translation of an audio recording of speech, e.g., a .wav or .mp3 file, into the text that was spoken. While this is pretty straightforward for humans (provided that they are speaking the right language and are not eating a delicious hamburger at the moment), it is hard for machines. In the past years, there has been a lot of progress in this area, and today we can do this easily on a server or even locally on a mobile device. In the following, we explain how you can make use of this API in your designs. Note: we currently do not support STT in scripting, because the processing of audio in scripts is not well-developed enough.

We are currently using the Whisper technology that was open-sourced by OpenAI, and we are running it on the Data Foundry server, not somewhere else in the cloud. That means you can safely use this service for STT, even for personal data (provided you have an appropriate and signed ERB form).

Whenever you submit a piece of audio to the server, the Whisper model needs to start, process the audio, and return the extracted text. That means API calls can take a few seconds to return, and the longer the audio that you submit, the longer the wait will be. Also, transmit audio chunks that are longer than 3-5 seconds; otherwise, the model does not have enough context to properly extract speech.

STT: JavaScript

We first built this API for JavaScript and web access. Even then, handling audio is not the easiest. Check out the full example below for some ideas:

<!DOCTYPE html>
<html>
<head>
	<meta charset="utf-8">
	<meta name="viewport" content="width=device-width, initial-scale=1">
	<title>Audio API Test</title>
	<link rel="stylesheet" href="https://unpkg.com/@picocss/pico@1.*/css/pico.min.css">
	<script src="https://www.WebRTC-Experiment.com/RecordRTC.js"></script>
</head>
<body>
	<main class="container">
		<h2>Audio API Test</h2>
		<p>
			Welcome to the test page, click button to start recording and the transcription will appear automatically:
		</p>
		<p>
			<button id="recordButton">Record!</button>
		</p>
		<p>
			<code id="dataDisplay"></code>
		</p>		
	</main>
	<script type="text/javascript">
		document.addEventListener("DOMContentLoaded", (event) => {

			// register button handler
			document.querySelector('#recordButton').addEventListener('click', (event) => {
				startRecording()
			})

			let recordAudio;

			function startRecording() {
				// make use of HTML 5/WebRTC, JavaScript getUserMedia()
				// to capture the browser microphone stream
				navigator.mediaDevices.getUserMedia({
					audio: true
				}).then(function(stream) {
					recordAudio = RecordRTC(stream, {
						type: 'audio',
						mimeType: 'audio/webm',
						sampleRate: 44100,
						recorderType: StereoAudioRecorder,
						numberOfAudioChannels: 1,
						desiredSampRate: 16000,

						// get interval-based blobs
						// value in milliseconds,
						// as you might not want to make API calls every second
						timeSlice: 10000, // <--- this is the only value you can change

						ondataavailable: function(blob) {
							var reader = new window.FileReader();
							reader.readAsDataURL(blob);
							reader.onloadend = function() {
								var base64data = reader.result;

								fetch("/api/vendor/s2t/<PROJECT-ID>", {
										method: "POST",
										mode: "cors",
										cache: "no-cache",
										headers: {
											"Content-Type": "application/json",
										},
										redirect: "follow",
										referrerPolicy: "no-referrer",
										body: JSON.stringify({
											api_token: "df-abcdef1234567890abcdef1234567890abcdef123456789=",
											audio: base64data
										}),
									})
									.then((response) => response.json())
									.then((json) => {
										console.log(json)

										// write the recognized speech to HTML element
										dataDisplay.innerHTML += json.text.replace("-", "")
									});
							}
						},
					})

					recordAudio.startRecording();
				}).catch((err) => {
					// always check for errors at the end.
					console.error(`${err.name}: ${err.message}`);
				});
			}
		});
	</script>
</body>
</html>

STT: Others

The STT API is also available via the command line and Python; however, you would first need to find a way to insert the audio stream (16-bit PCM) as a byte string. Let us know if you struggle with this; we can help.
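As a starting point, here is a minimal Python sketch. We assume that the endpoint accepts the same base64-encoded data URL payload that the JavaScript example above sends, and recording.wav is a placeholder for your own 16-bit PCM recording:

import base64
import requests

api_key = 'df-abcdef1234567890abcdef1234567890abcdef123456789='

# read a 16-bit PCM wav file and wrap it in a base64 data URL,
# mimicking what FileReader.readAsDataURL produces in the browser example
with open('recording.wav', 'rb') as f:
	base64data = 'data:audio/wav;base64,' + base64.b64encode(f.read()).decode('ascii')

data = {'api_token': api_key, 'audio': base64data}
headers = {'Content-Type': 'application/json'}
response = requests.post('https://data.id.tue.nl/api/vendor/s2t/<PROJECT-ID>', headers=headers, json=data)

# the text field contains the recognized speech (see the JavaScript example)
print(response.json()['text'])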