class IngestionSession extends App with CustomIngest
Establishes an input record and a process record for the custom ingestion process, and starts a heartbeat that allows DataForge to track the health of the process. When running manually, you must use one of the parameterized constructors below (the source name is mandatory in that case). Otherwise, no parameters are necessary.
Instance Constructors
- new IngestionSession(inSourceName: String, inProjectName: String)
- new IngestionSession(inSourceName: String)
- new IngestionSession()
-
new
IngestionSession(inEnvironment: String, inSourceName: String, inProjectName: String)
- inEnvironment
The DataForge environment this ingestion should be run against. Not required; deprecated as of release 2.5.
- inSourceName
The DataForge source name this ingestion should be run against. Mandatory when running manually.
- inProjectName
The DataForge project name this ingestion should be run against. If not provided, the Default project is used.
Type Members
-
case class
LatestTrackingFields(sTimestamp: Option[String], sSequence: Option[Long], inputId: Option[Int], extractDatetime: Option[String]) extends Product with Serializable
Fields tracking the latest ingestion event from the source.
- sTimestamp
Optional timestamp of the last ingestion event
- sSequence
Optional sequence number from the source
- inputId
Optional input ID associated with the event
- extractDatetime
Optional extraction datetime from the source
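A common use for tracking fields like these is to pull only the rows that are newer than the last ingested event. The following is a minimal sketch, not part of the SDK: the case class is redeclared locally as a stand-in so the snippet runs as a script or notebook cell without DataForge on the classpath, `incrementalPredicate` is a hypothetical helper, and the column names `seq` and `updated_at` as well as the timestamp format are assumptions.

```scala
// Local stand-in mirroring the documented fields of LatestTrackingFields,
// redeclared here so the sketch runs without the SDK.
case class LatestTrackingFields(
  sTimestamp: Option[String],
  sSequence: Option[Long],
  inputId: Option[Int],
  extractDatetime: Option[String]
)

// Hypothetical helper: builds a SQL predicate selecting rows newer than the
// last ingested event. It prefers the sequence number, then the timestamp,
// and falls back to a full pull when neither is known.
def incrementalPredicate(f: LatestTrackingFields, seqCol: String, tsCol: String): String =
  (f.sSequence, f.sTimestamp) match {
    case (Some(seq), _)   => s"$seqCol > $seq"
    case (None, Some(ts)) => s"$tsCol > '$ts'"
    case (None, None)     => "1 = 1"
  }

// First run: nothing has been ingested yet, so every field is empty.
val firstRun = LatestTrackingFields(None, None, None, None)
// Later run: a timestamp is known (the format shown is an assumption).
val laterRun = LatestTrackingFields(Some("2024-01-01 00:00:00"), None, Some(42), None)

println(incrementalPredicate(firstRun, "seq", "updated_at"))
println(incrementalPredicate(laterRun, "seq", "updated_at"))
```

Because every field is an `Option`, the first run of a source (when no prior input exists) has to be handled explicitly, which is what the final `case` does.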
-
case class
NormalizedAttribute(rawAttributeName: String, columnAlias: String, hiveDdlType: String, upcastExpr: Option[String]) extends Product with Serializable
- Definition Classes
- Util
Value Members
-
final
def
!=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
##(): Int
- Definition Classes
- AnyRef → Any
-
final
def
==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
implicit
val
AttributeSchemaReads: Reads[NormalizedAttribute]
- Definition Classes
- Util
-
def
args: Array[String]
- Attributes
- protected
- Definition Classes
- App
- Annotations
- @deprecatedOverriding( "args should not be overridden" , "2.11.0" )
-
final
def
asInstanceOf[T0]: T0
- Definition Classes
- Any
-
def
clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()
-
def
close: Unit
- Attributes
- protected
- Definition Classes
- Util
-
lazy val
cloud: String
- Attributes
- protected
- Definition Classes
- Util
-
implicit
val
config: Aux[MacroOptions]
- Definition Classes
- Util
-
lazy val
connectionId: Option[Int]
- Attributes
- protected
- Definition Classes
- Util
-
def
connectionParameters: JsObject
Retrieves secure connection parameters from the Databricks Secret store, including public and encrypted private credentials.
- returns
JSON object containing connection parameters
-
def
convertToObject(j: JsValue): Object
- Attributes
- protected
- Definition Classes
- Util
-
def
coreApiCall(route: String, verb: String = "GET", body: Option[JsValue] = None): JsValue
- Attributes
- protected
- Definition Classes
- Util
-
val
coreUri: String
- Attributes
- protected
- Definition Classes
- Util
-
lazy val
customParameters: JsObject
Custom parameters for the ingestion process, supplied via the source configuration.
- returns
JSON object mapping custom parameters
- Definition Classes
- IngestionSession → CustomIngest
-
lazy val
datalakePath: String
- Attributes
- protected
- Definition Classes
- CustomIngest
-
lazy val
destFilePath: String
- Attributes
- protected
- Definition Classes
- CustomIngest
-
def
endProcess(parameters: JsObject): Unit
- Attributes
- protected
- Definition Classes
- Util
-
lazy val
environment: String
- Attributes
- protected
- Definition Classes
- Util
-
final
def
eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
def
equals(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
val
executionStart: Long
- Definition Classes
- App
- Annotations
- @deprecatedOverriding( ... , "2.11.0" )
-
def
fail(message: String): Unit
Fails the ingestion process, marking the process and input as failed and writing a failure log.
- message
The log message to write along with the failure.
-
def
finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( classOf[java.lang.Throwable] )
-
lazy val
forceCaseInsensitiveFlag: Boolean
- Attributes
- protected
- Definition Classes
- CustomIngest
-
final
def
getClass(): Class[_]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
def
getConnectionParameters: JsObject
- Attributes
- protected
- Definition Classes
- Util
-
def
getProcessSourceName(processId: Int): String
- Attributes
- protected
- Definition Classes
- Util
-
def
hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
def
ingest(getDataFrame: () ⇒ DataFrame): Unit
Performs the heavy lifting of the custom ingestion process, executing user code, normalizing the DataFrame, and writing to the DataForge datalake.
- getDataFrame
A function that takes no parameters and returns a DataFrame. The session executes this function and normalizes the resulting DataFrame for ingestion.
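Because `ingest` receives a function rather than a DataFrame, the user code runs only when the callee invokes it. The sketch below illustrates that calling convention in plain Scala. `ingestLike`, `buildRows`, and the `events` buffer are hypothetical stand-ins, and the order of steps inside `ingestLike` only loosely mirrors the description above; it is not the SDK's implementation. With the SDK and Spark available, the analogous call would pass a `() => DataFrame` to `ingest`.

```scala
import scala.collection.mutable.ListBuffer

// Records the order in which things happen, so the deferral is visible.
val events = ListBuffer.empty[String]

// Hypothetical stand-in for `ingest`: it receives a function and decides
// when to call it. The real method works on a Spark DataFrame.
def ingestLike[A](getData: () => A): A = {
  events += "session: executing user code"
  val data = getData() // the user code runs here, not at the call site
  events += "session: normalizing and writing"
  data
}

// Stand-in for user code that would normally build a DataFrame.
def buildRows(): Seq[(Int, String)] = {
  events += "user: building rows"
  Seq((1, "a"), (2, "b"))
}

// Creating the function value does not run buildRows yet.
val thunk: () => Seq[(Int, String)] = () => buildRows()
val eventsBeforeCall = events.toList // still empty at this point

val rows = ingestLike(thunk)
events.foreach(println)
```

Passing a function lets the callee wrap the user code in its own bookkeeping, so anything expensive or failure-prone belongs inside the function body rather than before the call.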
-
lazy val
inputId: Int
- Attributes
- protected
- Definition Classes
- CustomIngest
-
final
def
isInstanceOf[T0]: Boolean
- Definition Classes
- Any
-
lazy val
jobRunId: Option[Int]
- Attributes
- protected
- Definition Classes
- Util
-
lazy val
latestTrackingFields: LatestTrackingFields
Latest tracking fields based on the source's most recent inputs.
- returns
LatestTrackingFields instance with optional timestamp, sequence, input ID, and extraction datetime
-
implicit
lazy val
latestTrackingFieldsReads: Reads[LatestTrackingFields]
JSON Reads for LatestTrackingFields used to parse tracking fields from the process response.
-
def
log(message: String): Unit
Writes a log for the ingestion process. The logs will be visible in the DataForge UI. Severity defaults to "I".
- message
The log message to record.
-
def
log(message: String, severity: String): Unit
Writes a log for the ingestion process. The logs will be visible in the DataForge UI.
- message
The log message to record.
- severity
A single character describing severity, typically "I" for info or "E" for error.
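The documented `log(message, severity)` and `fail(message)` signatures can be combined into a small guard around individual steps of user code. The following is a sketch under stated assumptions rather than SDK functionality: `SessionHooks` and `runStep` are hypothetical names, and the two operations are passed in as plain functions so the snippet runs as a script or notebook cell without DataForge. Whether the real `fail` returns normally or terminates the process is not stated above, so the helper does not rely on either behavior.

```scala
import scala.collection.mutable.ListBuffer
import scala.util.{Failure, Success, Try}

// The two documented operations this sketch needs, passed in as functions:
// log(message, severity) and fail(message).
final case class SessionHooks(log: (String, String) => Unit, fail: String => Unit)

// Hypothetical helper: runs one step, logs the outcome with severity "I" or
// "E", and reports a failure through the hooks if the step throws.
def runStep[A](hooks: SessionHooks, name: String)(step: => A): Option[A] =
  Try(step) match {
    case Success(value) =>
      hooks.log(s"$name succeeded", "I")
      Some(value)
    case Failure(e) =>
      hooks.log(s"$name raised ${e.getClass.getSimpleName}", "E")
      hooks.fail(s"$name failed: ${e.getMessage}")
      None
  }

// In-memory hooks for trying the helper outside DataForge.
val logged = ListBuffer.empty[(String, String)]
var failure: Option[String] = None
val hooks = SessionHooks(
  log = (message, severity) => logged += ((severity, message)),
  fail = message => { failure = Some(message) }
)

val parsed = runStep(hooks, "parse")("42".toInt)  // step succeeds
val broken = runStep(hooks, "parse")("n/a".toInt) // step throws NumberFormatException
```

With a real session in scope, the hooks could presumably be built as `SessionHooks(session.log(_, _), session.fail(_))`, which selects the two-argument `log` overload.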
-
def
logFail(e: Exception): Unit
- Attributes
- protected
- Definition Classes
- Util
-
lazy val
logId: Int
- Attributes
- protected
- Definition Classes
- CustomIngest → Util
-
val
logger: Logger
- Attributes
- protected
- Definition Classes
- Util
-
def
main(args: Array[String]): Unit
- Definition Classes
- App
- Annotations
- @deprecatedOverriding( "main should not be overridden" , "2.11.0" )
-
final
def
ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
final
def
notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
final
def
notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
lazy val
originalProcessId: Option[Int]
- Attributes
- protected
- Definition Classes
- Util
-
val
packageName: String
- Definition Classes
- Util
-
lazy val
pgConnectionLog: Connection
- Attributes
- protected
- Definition Classes
- Util
-
implicit
lazy val
pgConnectionMain: Connection
- Attributes
- protected
- Definition Classes
- Util
-
lazy val
pgConnectionString: String
- Attributes
- protected
- Definition Classes
- Util
-
val
pgConnectionStringRead: String
- Attributes
- protected
- Definition Classes
- Util
-
lazy val
process: JsValue
Starts a new ingestion process in the DataForge backend.
- returns
JSON value containing process details and tracking information
- Definition Classes
- IngestionSession → CustomIngest → Util
-
lazy val
processId: Int
- Attributes
- protected
- Definition Classes
- CustomIngest → Util
-
def
runIngest(getDataFrame: () ⇒ DataFrame): Unit
- Attributes
- protected
- Definition Classes
- CustomIngest
-
lazy val
smtpSenderDomain: String
- Attributes
- protected
- Definition Classes
- Util
-
lazy val
sourceId: Int
- Attributes
- protected
- Definition Classes
- CustomIngest
-
val
spark: SparkSession
- Attributes
- protected
- Definition Classes
- Util
-
final
def
synchronized[T0](arg0: ⇒ T0): T0
- Definition Classes
- AnyRef
-
val
systemConfiguration: JsValue
- Attributes
- protected
- Definition Classes
- Util
-
def
toString(): String
- Definition Classes
- AnyRef → Any
-
val
version: String
- Definition Classes
- Util
-
final
def
wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()
-
def
writeLog(message: String, severity: String): Unit
- Attributes
- protected
- Definition Classes
- Util
-
def
writeParsedData(inDf: DataFrame, sourceId: Int, inputId: Int, destFilePath: String, forceCaseInsensitiveFlag: Boolean): (Long, Long)
- Attributes
- protected
- Definition Classes
- Util
Deprecated Value Members
-
def
delayedInit(body: ⇒ Unit): Unit
- Definition Classes
- App → DelayedInit
- Annotations
- @deprecated
- Deprecated
(Since version 2.11.0) the delayedInit mechanism will disappear