magpie package¶
Contents¶
The Python API for Magpie.
- Public classes:
  - MagpieContext: Main entry point for Magpie functionality.
  - MagpieNotebookContext: A context object with helper methods for interacting with Magpie in a notebook setting.
  - MagpieError: An error used to wrap Java exceptions thrown by Magpie commands.
  - MagpieInfo: Information about a Magpie session.
  - MagpieRows: A collection of rows returned from a Magpie command.
  - MagpieVariables: A class used for accessing variables.
  - MagpieSecrets: A class used for accessing secrets.
- class magpie.MagpieContext(jmc, spark)¶
The main entry point for Magpie in Python.
The MagpieContext is used to execute commands, run SQL queries, get data frames for Magpie tables, and more.
Available as mc in Python script tasks and Python notebook blocks.
Example usage:
>>> flavors = ["apple", "banana", "strawberry"]
>>> df = mc.getTableDataFrame("store_sales")
>>> for f in flavors:
...     df.filter(df.flavor == f).createOrReplaceTempView("filtered_sales")
...     mc.sql("select * from filtered_sales")
...     mc.execute("save result as table %s_sales" % f)
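The loop above interpolates each flavor directly into a command string. A minimal sketch of a helper that checks the resulting table name first is shown here. The function save_command and the identifier pattern are hypothetical and not part of the Magpie API; Magpie's actual naming rules may differ.

```python
import re

# Assumed identifier rule: letters, digits and underscores, not starting
# with a digit. Magpie's real table-naming rules may be stricter or looser.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def save_command(prefix, suffix="sales"):
    """Build a 'save result as table <prefix>_<suffix>' command string."""
    table = "%s_%s" % (prefix, suffix)
    if not _IDENTIFIER.fullmatch(table):
        raise ValueError("unsafe table name: %r" % table)
    return "save result as table %s" % table
```

Inside the loop, the last call would then read mc.execute(save_command(f)).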
- about()¶
Return information about the current Magpie session
>>> info = mc.about()
>>> info.organization
'Silectis'
- Returns
current session
- Return type
MagpieInfo
- clearVariable(name)¶
This function is deprecated and will be removed in a future Magpie release. Use mc.variables.remove(name) instead.
Unset a particular variable in the context.
>>> mc.setVariable("age", 5)
>>> mc.clearVariable("age")
>>> mc.hasVariable("age")
False
- Parameters
name – Variable name
- clearVariables()¶
This function is deprecated and will be removed in a future Magpie release. Use mc.variables.clear() instead.
Clear all variables in the context.
>>> mc.setVariable("age", 5)
>>> mc.clearVariables()
>>> mc.hasVariable("age")
False
- execute(command)¶
Execute a single command and block until it completes
>>> mc.execute("save result as table people")
- Parameters
command – Command string to execute
- Returns
Command result: either None, a string (JSON), or MagpieRows.
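Because execute() can return three different shapes, calling code usually has to branch on the result. The following is a rough sketch of one way to do that. The function describe_result is hypothetical, and it detects MagpieRows by the presence of a rows attribute rather than by importing the class, which is an assumption made so the sketch runs outside a Magpie session.

```python
import json


def describe_result(result):
    """Normalize the documented return shapes of mc.execute() into a dict."""
    if result is None:
        return {"kind": "none"}
    if isinstance(result, str):
        try:
            # The documentation says string results are JSON.
            return {"kind": "json", "value": json.loads(result)}
        except ValueError:
            # Fall back to plain text if the string does not parse.
            return {"kind": "text", "value": result}
    if hasattr(result, "rows"):
        # Duck-typed stand-in for an isinstance check against MagpieRows.
        return {
            "kind": "rows",
            "rows": list(result.rows),
            "totalCount": getattr(result, "totalCount", None),
        }
    return {"kind": "unknown", "value": result}
```

In a session this might be called as describe_result(mc.execute("show 2 from people")).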
- getTableDataFrame(table)¶
Get the data frame for a Magpie table, optionally qualified by schema.
>>> df = mc.getTableDataFrame("people")
>>> df.show()
+-----+----+
| name| age|
+-----+----+
|Alice|   2|
|  Bob|   5|
+-----+----+
- Parameters
table – table name
- Returns
table
- Return type
pyspark.sql.DataFrame
- getVariable(name)¶
This function is deprecated and will be removed in a future Magpie release. Use mc.variables.get(name) instead.
Get the value of a variable.
>>> mc.setVariable("age", 5)
>>> mc.getVariable("age")
5
- Parameters
name – Variable name
- Returns
Variable value
- hasVariable(name)¶
This function is deprecated and will be removed in a future Magpie release. Use mc.variables.exists(name) instead.
Determine whether a variable is defined on the context.
>>> mc.setVariable("age", 5)
>>> mc.hasVariable("age")
True
- Parameters
name – Variable name
- Returns
Whether the variable is defined
- interpret(command)¶
Execute the provided command and render the result visually
- Parameters
command – Command to execute
For example, to render the first 100 rows of the table people:
>>> mc.interpret("show 100 from people")
- printVariables()¶
This function is deprecated and will be removed in a future Magpie release.
Get a listing of all variables defined on the context and their values.
>>> mc.setVariable("age", 5)
>>> mc.setVariable("color", "red")
>>> mc.printVariables()
age = 5
color = red
- Returns
Variable listing
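No replacement is named for printVariables(). One possible equivalent, sketched here under the assumption that only the documented list() and get() methods of MagpieVariables are available, builds the same kind of listing by hand. The function format_variables is hypothetical, and its "name = value" layout simply mirrors the example output above.

```python
def format_variables(variables):
    """Return one 'name = value' line per variable, sorted by name."""
    return "\n".join(
        "%s = %s" % (name, variables.get(name))
        for name in sorted(variables.list())
    )
```

In a session this might be used as print(format_variables(mc.variables)).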
- profile(df)¶
Profile a data frame and render the profile visually
- Parameters
df – pyspark.sql.DataFrame to profile
>>> mc.profile(df)
- result()¶
Get the last result as a spark data frame
>>> mc.sql("select * from people")
>>> df = mc.result()
>>> df.show()
+-----+----+
| name| age|
+-----+----+
|Alice|   2|
|  Bob|   5|
+-----+----+
- Returns
last result
- Return type
pyspark.sql.DataFrame
- setVariable(name, value)¶
This function is deprecated and will be removed in a future Magpie release. Use mc.variables.set(name, value) instead.
Set a variable to the given value in the context.
>>> mc.setVariable("age", 5)
>>> mc.getVariable("age")
5
- Parameters
name – Variable name
value – Variable value
- sql(sql)¶
Execute a SQL command, returning the resulting data frame.
>>> df = mc.sql("select distinct name from people")
>>> df.collect()
[Row(name='Alice'), Row(name='Bob')]
- Parameters
sql – SQL statement
- Returns
result
- Return type
pyspark.sql.DataFrame
- substituteVariables(input)¶
This function is deprecated and will be removed in a future Magpie release.
Note: substitution is performed automatically on arguments passed to execute() and sql().
Substitute any variables present in the input string with the current values stored in the context.
An exception is thrown if any variable cannot be matched.
>>> mc.setVariable("age", 5)
>>> mc.substituteVariables("select * from people where age = $#age")
select * from people where age = 5
- Parameters
input – Input string
- Returns
String with variables substituted for their values
- exception magpie.MagpieError(java_exception)¶
Exception propagated from Java when performing a Magpie action.
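A minimal sketch of catching this exception around a batch of commands follows. The helper run_all is hypothetical. Its error_type parameter defaults to Exception only so that the sketch runs outside a Magpie session; within a session one would presumably pass magpie.MagpieError instead.

```python
def run_all(mc, commands, error_type=Exception):
    """Run each command, collecting failures instead of stopping at the first."""
    failures = []
    for command in commands:
        try:
            mc.execute(command)
        except error_type as exc:
            # Record the failing command with the error message and continue.
            failures.append((command, str(exc)))
    return failures
```

Collecting failures this way keeps one bad command from hiding problems in later ones, at the cost of continuing to run after an error.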
- class magpie.MagpieInfo(info)¶
Information about a Magpie session.
- property cluster¶
Current cluster name, if set
- property instance¶
Cluster instance name
- property name¶
Name of the application (Magpie)
- property organization¶
Current organization name, if set
- property project¶
Current project name, if set
- property repository¶
Current repository name, if set
- property schema¶
Current schema name, if set
- property user¶
Current user name, if set
- property version¶
Application version
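For logging or debugging it can be handy to gather these properties in one place. The sketch below reads only the property names listed above. The helper info_to_dict and the constant INFO_FIELDS are hypothetical, and unset properties are assumed to come back as None.

```python
# Property names taken from the MagpieInfo listing above.
INFO_FIELDS = (
    "name", "version", "organization", "user", "cluster",
    "instance", "project", "repository", "schema",
)


def info_to_dict(info):
    """Collect the documented MagpieInfo properties into a plain dict."""
    return {field: getattr(info, field, None) for field in INFO_FIELDS}
```

In a session this might be called as info_to_dict(mc.about()).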
- class magpie.MagpieNotebookContext(mc)¶
Magpie context wrapper with notebook utility methods.
Available as magpie in Python blocks in the notebook.
- interpret(command)¶
This function is deprecated and will be removed in a future Magpie release. Use mc.interpret(command) instead.
Execute the provided command and render the result visually in the notebook
- Parameters
command – Command to execute
For example, to render the first 100 rows of the table people in the notebook:
>>> magpie.interpret("show 100 from people")
- profile(df)¶
This function is deprecated and will be removed in a future Magpie release. Use mc.profile(df) instead.
Profile a data frame and render the profile visually in the notebook
- Parameters
df – pyspark.sql.DataFrame to profile
>>> magpie.profile(df)
- class magpie.MagpieRows(rows, totalCount)¶
A collection of rows returned from a Magpie command, optionally including the total count of the source data set.
When a total count is requested:
>>> res = mc.execute("show 2 from people with count")
>>> res.rows
[Row(age=2, name='Alice'), Row(age=5, name='Bob')]
>>> res.totalCount
100
And when a total count is not requested:
>>> res2 = mc.execute("show 2 from people")
>>> res2.rows
[Row(age=2, name='Alice'), Row(age=5, name='Bob')]
>>> res2.totalCount
None
- property rows¶
A list of pyspark.sql.Row returned by the command, with columns converted to strings for display
>>> res = mc.execute("show 2 from people")
>>> res.rows
[Row(age=2, name='Alice'), Row(age=5, name='Bob')]
- property totalCount¶
Optionally, the total size of the source data set for the command
>>> res = mc.execute("show 2 from people with count")
>>> res.totalCount
100
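Since totalCount is None unless a count was requested, code that reports on a result needs to handle both cases. A small sketch, with the hypothetical helper summarize relying only on the rows and totalCount properties described above:

```python
def summarize(result):
    """Describe how many rows were returned, with the total when known."""
    shown = len(result.rows)
    if result.totalCount is None:
        # No "with count" in the command, so the source size is unknown.
        return "%d rows shown" % shown
    return "%d of %d rows shown" % (shown, result.totalCount)
```

In a session this might be called as summarize(mc.execute("show 2 from people with count")).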
- class magpie.MagpieSecrets(jmc)¶
A context object for accessing Magpie Secrets in Python.
MagpieSecrets can be accessed via the secrets field of the MagpieContext.
Example usage:
>>> mc.secrets.list()
['api_key', 'password']
>>> mc.secrets.exists("api_key")
True
>>> mc.secrets.get("api_key")
my_api_key_contents
- exists(name)¶
Checks if a secret by the given name exists.
A return value of True does not imply that the caller has permission to read or write the specified secret.
>>> mc.secrets.exists('api_key')
True
- Parameters
name – secret name
- Returns
whether the secret is defined
- get(name)¶
Get the value of a secret.
>>> mc.secrets.get('api_key')
my_api_key_contents
- Parameters
name – secret name
- Returns
secret value
- list()¶
Lists all secrets.
>>> mc.secrets.list()
['api_key', 'password']
- Returns
a list of the names of all secrets
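The exists() and get() methods can be combined to read a secret with a fallback. The sketch below assumes only the behavior documented above. The helper get_secret is hypothetical, and what get() does when the caller lacks permission is not specified here, so any error it raises is left to propagate.

```python
def get_secret(secrets, name, default=None):
    """Return the secret's value, or `default` when no such secret exists."""
    if not secrets.exists(name):
        return default
    # exists() returning True does not guarantee read permission,
    # so get() may still fail; that error is not caught here.
    return secrets.get(name)
```

In a session this might be called as get_secret(mc.secrets, "api_key").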
- class magpie.MagpieVariables(jmc)¶
A context object used for accessing and updating Magpie Variables in Python.
MagpieVariables can be accessed via the variables field of the MagpieContext.
Example usage:
>>> mc.variables.set("my_var", 3)
>>> mc.variables.get("my_var")
3
>>> mc.variables.exists("my_var")
True
>>> mc.variables.clear()
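Because variables live on the shared context, a script that sets one for a single query may want to restore the previous state afterwards. One way to sketch this is a context manager built on exists(), get(), set() and remove(). The name temporary_variable is hypothetical and not part of the Magpie API.

```python
from contextlib import contextmanager


@contextmanager
def temporary_variable(variables, name, value):
    """Set a variable for the duration of a with-block, then restore it."""
    had_previous = variables.exists(name)
    previous = variables.get(name) if had_previous else None
    variables.set(name, value)
    try:
        yield
    finally:
        if had_previous:
            # Put back whatever value was there before the block.
            variables.set(name, previous)
        else:
            # The variable did not exist before, so unset it again.
            variables.remove(name)
```

In a session this might wrap a query, for example: with temporary_variable(mc.variables, "age", 10): followed by an mc.sql(...) call in the block body.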
- clear()¶
Clears all variables set in the context.
>>> mc.variables.set("my_var", 3)
>>> mc.variables.clear()
>>> mc.variables.exists("my_var")
False
- exists(name)¶
Determine whether a variable is defined in the context.
>>> mc.variables.set("my_var", 3)
>>> mc.variables.exists("my_var")
True
- Parameters
name – variable name
- Returns
whether the variable is defined
- get(name)¶
Get the value of a variable.
>>> mc.variables.set("my_var", 3)
>>> mc.variables.get("my_var")
3
- Parameters
name – variable name
- Returns
variable value
- list()¶
List all variables defined in the current context.
>>> mc.variables.set("age", 10)
>>> mc.variables.set("height", 55)
>>> mc.variables.list()
['age', 'height']
- Returns
A list of variable names
- remove(name)¶
Unset a particular variable in the context.
>>> mc.variables.set("my_var", 3)
>>> mc.variables.remove("my_var")
>>> mc.variables.exists("my_var")
False
- Parameters
name – variable name
- set(name, value)¶
Set a variable to the given value in the context.
>>> mc.variables.set("my_var", 3)
>>> mc.variables.get("my_var")
3
- Parameters
name – variable name
value – variable value
- Returns