🥋 Karate 🥰 ❄️ Snowflake

🇫🇷 Ne ratez pas votre SQL avec un test Karate

🇬🇧 Don’t miss your SQL code with a Karate test

🪄 Once upon a time…​ ✨

A great data pipeline !

data pipeline simple

$ whoami 👨‍💻

A great pipeline…​ but sometimes…​ 😥

Not so great…​ more complex

data pipeline dirty data pipeline complex

💡We need Tests 🤗 & QA 🕵️ 🐛

But also some tools…​

Let’s talk about architecture 👷

ETL vs ELT…​

snowflake text

  • Data Cloud…​ DBMS++

  • Snowpark, Stream/Task…​

  • STORAGE 💰 dissociated from COMPUTE 💰💰💰

  • Scalable COMPUTE

kapoeira text

Kapoeira - Example (1/2)

burger factory

Kapoeira - Example (2/2)

Feature: Burger 🍔 feature

  Background:
    Given input topic
      | topic     | alias        | key_type | value_type |
      | bread     | bread-in     | string   | string     |
      | vegetable | vegetable-in | string   | string     |
      | meat      | meat-in      | string   | string     |
    And output topic
      | topic  | alias      | key_type | value_type | readTimeoutInSecond |
      | burger | burger-out | string   | string     | 5                   |
    And var uuid = call function: uuid

  Scenario Outline: Burger Factory
    When records with key and value are sent
      | topic_alias  | key        | value       |
      | bread-in     | 🧑‍🍳_${uuid} | <bread>     |
      | vegetable-in | 🧑‍🍳_${uuid} | <vegetable> |
      | meat-in      | 🧑‍🍳_${uuid} | <meat>      |
      
    Then expected records
      | topic_alias | key        | value  |
      | burger-out  | 🧑‍🍳_${uuid} | result |

    And assert result $ == "<output>"

    Examples:
      | bread | vegetable | meat | output       |
      | 🍞    | 🍅        | 🥩   | 🍔           |
      | 🍞    | 🍅        | 🍗   | 🍔           |
      | 🍞    | 🍅        | 🐟   | 🍔           |
      | 🍞    | 🥕        | 🥩   | 🍞 + 🥕 + 🥩 |

karatelabs

Karate - Example

Feature: karate 'hello world' example
  
  Scenario: create and retrieve a cat
    Given url 'http://myhost.com/v1/cats'
    And request { name: 'Billie' }
    When method post
    Then status 201
    And match response == { id: '#notnull', name: 'Billie' }
    
    Given path response.id
    When method get
    Then status 200

Karate - HTTP
Snowflake - JDBC SQL
🤔

Karate/Snowflake - Example

Feature: SELECT
  Background:
    * url "https://<SNOWFLAKE_ACCOUNT>.snowflakecomputing.com/api/v2"
    * def jwt = karate.exec("snow connection generate-jwt --silent --account <SNOWFLAKE_ACCOUNT> --user <SNOWFLAKE_USER> --private-key-file <SNOWFLAKE_USER_PEM>")
    * def bearer = "Bearer " + jwt
  Scenario: Select 1 cutter
    Given path "statements"
    And header Content-Type = "application/json"
    And header Accept = "application/json"
    And header Authorization = bearer
    And header User-Agent = "<MY_APP_IT>"
    And header X-Snowflake-Authorization-Token-Type = "KEYPAIR_JWT"
    And def payload = 
        """
        {
          "statement": "SELECT SERIAL_NUMBER, CUTTER_TYPE FROM CUTTER WHERE SERIAL_NUMBER='MY_VECTOR'",
          "timeout": 60,
          "role": "<SNOWFLAKE_ROLE>",
          "warehouse": "<SNOWFLAKE_WAREHOUSE>",
          "database": "<SNOWFLAKE_DATABASE>",
          "schema": "<SNOWFLAKE_SCHEMA>"
        }
        """
    When request payload
    And method post
    Then status 200
    And match response.resultSetMetaData.numRows == 1
    And match response.resultSetMetaData.rowType[0].name == "SERIAL_NUMBER"
    And match response.resultSetMetaData.rowType[1].name == "CUTTER_TYPE"
    And match response.data[0][0] == "MY_VECTOR"
    And match response.data[0][1] == "VECTOR"

Complex for a SELECT…​😠
Run Task & DBT ?🤔

data pipeline elt
  • Karate DSL extension - Examples

  • karate.exec(<any_command>)

  • 😃

💡Lectra Karate Tool
💪😎👍

  • Fat JAR (karate.jar + extensions) or Docker Image (with all you need)

  • Retry / timeout configuration

  • Clone / Drop schema for each Scenario

  • Other advanced features

  • 😃

Karate/Snowflake - Example V2

Feature: SELECT
  Background:
    * url baseUrl
    * def jwt = snowflake.generateJwtToken(snowflakeCliConfig)
    * configure headers = snowflake.requestHeaders(jwt, projectName)
  Scenario: Select 1 cutter
    Given string statement = "SELECT SERIAL_NUMBER, CUTTER_TYPE FROM CUTTER WHERE SERIAL_NUMBER='MY_VECTOR'"
    And def response = snowflake.runSql({statement: statement, snowflakeConfig: snowflakeConfig})
    And table expectedData
      | SERIAL_NUMBER | CUTTER_TYPE |
      | "MY_VECTOR"   | "VECTOR"    |
    And match response.data == expectedData

+ karate-config.js for configuration

Demo Time 🎬

Kafka version
burger factory

Snowflake version
burger factory

burger-factory.feature

Feature: Demo
  Background:
    * url baseUrl
    * def jwt = snowflake.generateJwtToken(snowflakeCliConfig)
    * configure headers = snowflake.requestHeaders(jwt, projectName)

  Scenario Outline: Burger Factory
    Given string clientId = common.uuid() 
    And table inserts
      | snowflakeConfig          | table     | value       |
      | snowflakeConfigBread     | BREAD     | <bread>     |
      | snowflakeConfigVegetable | VEGETABLE | <vegetable> |
      | snowflakeConfigMeat      | MEAT      | <meat>      |
    And def callInsert = function(row) { return karate.runSql({ statement: "INSERT INTO "+row.table+"(CLIENT_ID, VALUE) VALUES ('"+clientId+"','"+row.value+"')", snowflakeConfig: row.snowflakeConfig}).responseStatus }
    And def responses = karate.map(inserts, callInserts)
    And match responses contains only 200
    
    When def dbtResponse = karate.exec("dbt run ...")
    And match dbtResponse == "OK"

    Then string selectStatement = "SELECT VALUE FROM BURGER WHERE CLIENT_ID='"+clientId+"'" 
    And def response = karate.runSql({ statement: selectStatement, snowflakeConfig: snowflakeConfigBurger })
    And match response.data == [ { "VALUE" : "<output>" } ]

    Examples:
      | bread | vegetable | meat | output       |
      | 🍞    | 🍅        | 🥩   | 🍔           |
      | 🍞    | 🍅        | 🍗   | 🍔           |
      | 🍞    | 🍅        | 🐟   | 🍔           |
      | 🍞    | 🥕        | 🥩   | 🍞 + 🥕 + 🥩 |

Next Steps 🚀

  • Open Source

🎉 Thank you 🙏

Questions ?

Feedback TODO