Title: Interface to the Google Cloud Machine Learning Platform
Description: Interface to the Google Cloud Machine Learning Platform <https://cloud.google.com/ml-engine>, which provides cloud tools for training machine learning models.
Authors: Daniel Falbel [aut, cre], Javier Luraschi [aut], JJ Allaire [aut], Kevin Ushey [aut], RStudio [cph]
Maintainer: Daniel Falbel <[email protected]>
License: Apache License 2.0
Version: 0.6.1
Built: 2024-11-10 05:46:20 UTC
Source: https://github.com/cran/cloudml
Deploys a SavedModel to a CloudML model for online prediction.
cloudml_deploy(export_dir_base, name, version = paste0(name, "_1"), region = NULL, config = NULL)
export_dir_base | A string containing a directory containing an exported SavedModel. Consider using tensorflow::export_savedmodel() to export this SavedModel.
name | The name for this model (required).
version | The version for this model. Versions start with a letter and contain only letters, numbers and underscores. Defaults to name_1.
region | The region to be used to deploy this model.
config | A list, YAML or JSON configuration file.
Other CloudML functions: cloudml_predict(), cloudml_train()
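A minimal sketch of a deployment, assuming a trained Keras model is at hand; the model object, directory, and model name below are hypothetical:
## Not run:
library(cloudml)
# Export a trained model as a SavedModel, then deploy that directory.
# `model` is a hypothetical, already-trained Keras model.
tensorflow::export_savedmodel(model, "savedmodel")
cloudml_deploy("savedmodel", name = "my_model")
## End(Not run)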
Perform online prediction over a CloudML model, usually one created using cloudml_deploy().
cloudml_predict(instances, name, version = paste0(name, "_1"), verbose = FALSE)
instances | A list of instances to be predicted. When predicting a single instance, it must still be wrapped in a list.
name | The name for this model (required).
version | The version for this model. Versions start with a letter and contain only letters, numbers and underscores. Defaults to name_1.
verbose | Should additional information be reported?
Other CloudML functions: cloudml_deploy(), cloudml_train()
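A minimal sketch, assuming a model named "my_model" was previously deployed with cloudml_deploy(); the instance fields are hypothetical:
## Not run:
library(cloudml)
# Each instance is itself a list; a single instance must still be
# wrapped in an outer list.
instances <- list(
  list(age = 42, income = 50000),
  list(age = 24, income = 30000)
)
cloudml_predict(instances, name = "my_model")
## End(Not run)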
Upload a TensorFlow application to Google Cloud, and use that application to train a model.
cloudml_train(file = "train.R", master_type = NULL, flags = NULL, region = NULL, config = NULL, collect = "ask", dry_run = FALSE)
file | File to be used as the entrypoint for training.
master_type | Training master node machine type. "standard" provides a basic machine configuration suitable for training simple models with small to moderate datasets. See the documentation at https://cloud.google.com/ml-engine/docs/tensorflow/machine-types#machine_type_table for details on available machine types.
flags | Named list with flag values (see flags() in the tfruns package) or path to a YAML file containing flag values.
region | The region to be used for training.
config | A list, YAML or JSON configuration file.
collect | Logical. If TRUE, collect job when training is completed (blocks waiting for the job to complete). The default, "ask", will interactively prompt the user whether to collect the results.
dry_run | Triggers a local dry run over the deployment phase, to validate that packages and packing work as expected.
See also: job_status(), job_collect(), job_cancel()
Other CloudML functions: cloudml_deploy(), cloudml_predict()
## Not run:
library(cloudml)
gcloud_install()
job <- cloudml_train("train.R")
## End(Not run)
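A fuller sketch, assuming train.R reads its flag values with tfruns::flags(); the flag names and machine type below are illustrative:
## Not run:
library(cloudml)
# Train on a GPU machine, pass flag values through to train.R, then
# stream logs and collect the fitted model once the job completes.
job <- cloudml_train(
  "train.R",
  master_type = "standard_gpu",
  flags = list(epochs = 20, batch_size = 128),
  collect = FALSE
)
job_stream_logs(job)
job_collect(job)
## End(Not run)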
Initialize the Google Cloud SDK
gcloud_init()
Other Google Cloud SDK functions: gcloud_install(), gcloud_terminal()
Installs the Google Cloud SDK which enables CloudML operations.
gcloud_install(update = TRUE)
update | Attempt to update an existing installation.
Other Google Cloud SDK functions: gcloud_init(), gcloud_terminal()
## Not run:
library(cloudml)
gcloud_install()
## End(Not run)
Create an RStudio terminal with access to the Google Cloud SDK
gcloud_terminal(command = NULL, clear = FALSE)
command | Command to send to the terminal.
clear | Clear the terminal buffer.
Terminal id (invisibly)
Other Google Cloud SDK functions: gcloud_init(), gcloud_install()
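A short sketch; the gcloud command shown is only an illustration:
## Not run:
library(cloudml)
# Open an RStudio terminal and show the active gcloud configuration.
id <- gcloud_terminal("gcloud config list")
## End(Not run)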
Get version of Google Cloud SDK components.
gcloud_version()
A list with the version of each component.
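For example:
## Not run:
library(cloudml)
# Inspect the installed SDK component versions.
str(gcloud_version())
## End(Not run)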
Use the gsutil cp command to copy data between your local file system and the cloud, copy data within the cloud, and copy data between cloud storage providers.
gs_copy(source, destination, recursive = FALSE, echo = TRUE)
source | The file to be copied. This can be either a path on the local filesystem, or a Google Storage URI (e.g. gs://[BUCKET_NAME]/[FILENAME.CSV]).
destination | The location where the source file should be copied to. This can also be either a local filesystem path or a Google Storage URI.
recursive | Boolean; perform a recursive copy? This must be specified if you intend on copying directories.
echo | Echo command output to console.
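A sketch of both directions, with a hypothetical bucket name:
## Not run:
library(cloudml)
# Download a single file from a bucket to the working directory.
gs_copy("gs://my-bucket/data/train.csv", "train.csv")
# Recursively upload a local directory into the bucket.
gs_copy("data", "gs://my-bucket/data", recursive = TRUE)
## End(Not run)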
Refer to data within a Google Storage bucket. When running on CloudML, the bucket will be read from directly. Otherwise, the bucket will be automatically synchronized to a local directory.
gs_data_dir(url, local_dir = "gs", force_sync = FALSE, echo = TRUE)
url | Google Storage bucket URL (e.g. gs://<your-bucket>).
local_dir | Local directory to synchronize Google Storage bucket(s) to.
force_sync | Force local synchronization even if the data directory already exists.
echo | Echo command output to console.
This function is suitable for use in TensorFlow APIs that accept gs:// URLs (e.g. TensorFlow datasets). However, many package functions accept only local filesystem paths as input (rather than gs:// URLs). For these cases you can use the gs_data_dir_local() function, which will always synchronize gs:// buckets to the local filesystem and provide a local path interface to their contents.
Path to contents of data directory.
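A sketch with a hypothetical bucket; the tfrecord_dataset() call from the tfdatasets package is an assumption, used only to show the returned path feeding a gs://-aware API:
## Not run:
library(cloudml)
# On CloudML this resolves to the gs:// URL itself; elsewhere the
# bucket is synchronized to a local "gs" directory first.
data_dir <- gs_data_dir("gs://my-bucket/data")
# Pass the resulting path to a TensorFlow API that accepts gs:// URLs
# (tfdatasets is an assumption here, not part of this package):
dataset <- tfdatasets::tfrecord_dataset(file.path(data_dir, "train.tfrecords"))
## End(Not run)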
Provides a local filesystem interface to Google Storage buckets. Many package functions accept only local filesystem paths as input (rather than gs:// URLs). For these cases the gs_data_dir_local() function will synchronize gs:// buckets to the local filesystem and provide a local path interface to their contents.
gs_data_dir_local(url, local_dir = "gs", echo = FALSE)
url | Google Storage bucket URL (e.g. gs://<your-bucket>).
local_dir | Local directory to synchronize Google Storage bucket(s) to.
echo | Echo command output to console.
If you pass a local path as the url, it will be returned unmodified. This allows you, for example, to use a training flag for the location of data that points to a local directory during development and to a Google Cloud bucket during cloud training.
Local path to contents of bucket.
For APIs that accept gs:// URLs directly (e.g. TensorFlow datasets) you should use the gs_data_dir() function.
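A sketch with a hypothetical bucket:
## Not run:
library(cloudml)
# Always yields a local path (synchronizing the bucket if needed), so
# the result works with functions that only accept local paths.
images_dir <- gs_data_dir_local("gs://my-bucket/images")
image_files <- list.files(images_dir, recursive = TRUE)
## End(Not run)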
The gs_rsync() function makes the contents under destination the same as the contents under source, by copying any missing files/objects (or those whose data has changed), and (if the delete option is specified) deleting any extra files/objects. source must specify a directory, bucket, or bucket subdirectory.
gs_rsync(source, destination, delete = FALSE, recursive = FALSE, parallel = TRUE, dry_run = FALSE, options = NULL, echo = TRUE)
source | The file or directory to be copied. This can be either a path on the local filesystem, or a Google Storage URI (e.g. gs://[BUCKET_NAME]/[FILENAME.CSV]).
destination | The location where the source should be copied to. This can also be either a local filesystem path or a Google Storage URI.
delete | Delete extra files under destination not found under source. By default extra files are not deleted.
recursive | Causes directories, buckets, and bucket subdirectories to be synchronized recursively. If you neglect to use this option, gs_rsync() will make only the top-level directory in the source and destination URLs match, skipping any sub-directories.
parallel | Causes synchronization to run in parallel. This can significantly improve performance if you are performing operations on a large number of files over a reasonably fast network connection.
dry_run | Causes rsync to run in "dry run" mode, i.e., just outputting what would be copied or deleted without actually doing any copying/deleting.
options | Character vector of additional command line options to the gsutil rsync command (as specified at https://cloud.google.com/storage/docs/gsutil/commands/rsync).
echo | Echo command output to console.
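A sketch with a hypothetical bucket:
## Not run:
library(cloudml)
# Preview the changes first, then mirror the bucket subdirectory into
# a local "data" directory, deleting local files absent from the bucket.
gs_rsync("gs://my-bucket/data", "data", recursive = TRUE, dry_run = TRUE)
gs_rsync("gs://my-bucket/data", "data", delete = TRUE, recursive = TRUE)
## End(Not run)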
Cancel a job.
job_cancel(job = "latest")
job | Job name or job object. Pass "latest" to indicate the most recently submitted job.
Other job management functions: job_collect(), job_list(), job_status(), job_stream_logs(), job_trials()
Collect the job outputs (e.g. fitted model) from a job. If the job has not yet finished running, job_collect() will block and wait until the job has finished.
job_collect(job = "latest", trials = "best", destination = "runs", timeout = NULL, view = interactive())
job | Job name or job object. Pass "latest" to indicate the most recently submitted job.
trials | Under hyperparameter tuning, specifies which trials to download. Use "best" to download the best trial, "all" to download all trials, or a vector of trial numbers.
destination | The destination directory in which model outputs should be downloaded. Defaults to runs.
timeout | Give up collecting the job after the specified number of minutes.
view | View the job results after collecting them. You can also pass "save" to save a copy of the run report at tfruns.d/view.html.
Other job management functions: job_cancel(), job_list(), job_status(), job_stream_logs(), job_trials()
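For example:
## Not run:
library(cloudml)
# Collect the most recently submitted job, keeping only the best
# hyperparameter tuning trial and giving up after 30 minutes.
job_collect("latest", trials = "best", timeout = 30)
## End(Not run)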
List existing Google Cloud ML jobs.
job_list(filter = NULL, limit = NULL, page_size = NULL, sort_by = NULL, uri = FALSE)
filter | Filter the set of jobs to be returned.
limit | The maximum number of resources to list. By default, all jobs will be listed.
page_size | Some services group resource list output into pages. This flag specifies the maximum number of resources per page. The default is determined by the service if it supports paging, otherwise it is unlimited (no paging).
sort_by | A comma-separated list of resource field key names to sort by. The default order is ascending. Prefix a field with "~" for descending order on that field.
uri | Print a list of resource URIs instead of the default output.
Other job management functions: job_cancel(), job_collect(), job_status(), job_stream_logs(), job_trials()
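A sketch; the createTime field name follows the Cloud ML Engine jobs API and is an assumption here:
## Not run:
library(cloudml)
# List the ten most recent jobs, newest first.
job_list(limit = 10, sort_by = "~createTime")
## End(Not run)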
Get the status of a job, as an R list.
job_status(job = "latest")
job | Job name or job object. Pass "latest" to indicate the most recently submitted job.
Other job management functions: job_cancel(), job_collect(), job_list(), job_stream_logs(), job_trials()
Show logs from a running Cloud ML Engine job.
job_stream_logs(job = "latest", polling_interval = getOption("cloudml.stream_logs.polling", 5), task_name = NULL, allow_multiline_logs = FALSE)
job | Job name or job object. Pass "latest" to indicate the most recently submitted job.
polling_interval | Number of seconds to wait between efforts to fetch the latest log messages.
task_name | If set, display only the logs for this particular task.
allow_multiline_logs | Output multiline log messages as single records.
Other job management functions: job_cancel(), job_collect(), job_list(), job_status(), job_trials()
Get the hyperparameter trials for a job, as an R data frame.
job_trials(x)
x | Job name or job object.
Other job management functions: job_cancel(), job_collect(), job_list(), job_status(), job_stream_logs()
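A sketch, assuming train.R defines a hyperparameter tuning job:
## Not run:
library(cloudml)
# Submit a tuning job, then fetch its trials as a data frame
# (one row per trial).
job <- cloudml_train("train.R")
trials <- job_trials(job)
head(trials)
## End(Not run)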