@Enrico Rotundo has joined the channel
Hi all! I’ve created this channel to chat about bringing OpenLineage to Bacalhau (https://www.bacalhau.org/)
the idea would be that at every step of a DAG execution, we automatically read the inputs (if any) and create a metadata file with OpenLineage information in it
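A rough sketch of what that per-step metadata file could look like, using only the Python standard library. The field names follow the OpenLineage run event shape (`eventType`, `eventTime`, `run`, `job`, `inputs`, `outputs`, `producer`, `schemaURL`); everything else here is an assumption made for illustration: the helper names `build_lineage_event` and `write_lineage_file`, the `bacalhau` namespace, the `openlineage.json` file name, the producer URL, and the exact spec version in `schemaURL`.

```python
import json
import tempfile
import uuid
from datetime import datetime, timezone
from pathlib import Path


def build_lineage_event(job_name, input_names, output_names, namespace="bacalhau"):
    """Build an OpenLineage-style run event for one DAG step (hypothetical helper)."""

    def dataset(name):
        # Datasets are identified by a namespace and a name.
        return {"namespace": namespace, "name": name}

    return {
        "eventType": "COMPLETE",
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "run": {"runId": str(uuid.uuid4())},
        "job": {"namespace": namespace, "name": job_name},
        "inputs": [dataset(n) for n in input_names],
        "outputs": [dataset(n) for n in output_names],
        # Placeholder producer URL, not a real Bacalhau endpoint.
        "producer": "https://example.com/hypothetical-bacalhau-lineage",
        # Assumed spec version; check the OpenLineage spec for the current one.
        "schemaURL": "https://openlineage.io/spec/1-0-5/OpenLineage.json#/definitions/RunEvent",
    }


def write_lineage_file(event, directory):
    """Write the event next to the step's outputs (file name is an assumption)."""
    path = Path(directory) / "openlineage.json"
    path.write_text(json.dumps(event, indent=2))
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as out_dir:
        event = build_lineage_event("resize-images", ["raw-images"], ["thumbnails"])
        print(write_lineage_file(event, out_dir).read_text())
```

Writing a plain JSON file keeps each step independent of any lineage backend; the same event could later be sent to something like Marquez instead.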
Enrico has done some early thinking on this - https://www.notion.so/pl-strflt/Initial-design-doc-Oct-22-d2b032bd16e340d3ada39171c9ad524d
and in parallel created an Airflow operator -> https://github.com/enricorotundo/bacalhau-airflow-provider
hey @Julien Le Dem glad to meet you! The TL;DR is we’re going to use Airflow to orchestrate Bacalhau pipelines, and would love to add OpenLineage to pipelines too. I see Marquez already integrates with Airflow so that may be a good fit!
Hello @Enrico Rotundo nice to meet you as well. FYI we have the monthly meeting on zoom tomorrow if you guys want to join.
https://openlineage.slack.com/archives/C01CK9T7HKR/p1670432586277209
thanks for sharing that! I’ll join you!
Hi @Julien Le Dem As I’m starting to build an airflow operator for Bacalhau (which will include support for OpenLineage 🙂), I was wondering if you could share your knowledge about building operators. Why did you place the current operator in the OpenLineage repo rather than raising a PR to the “official” community-built Airflow repo? ~Is there any specific reason (community guidelines, technical, etc.)?~
*Thread Reply:* Oh now that I’m reading your new operator draft AIP it all makes sense
*Thread Reply:* ok so at this point the question is… for newborn operators, do you suggest starting with their own package or trying to merge into airflow.providers directly?
*Thread Reply:* I think you can create your own Provider package with your operator. This is more a question for the airflow mailing list.
*Thread Reply:* I would recommend this to add lineage support for your operator: https://openlineage.io/docs/integrations/airflow/operator/ https://openlineage.io/docs/integrations/airflow/extractors/default-extractors
*Thread Reply:* And nice to hear from you @Enrico Rotundo!
*Thread Reply:* The next monthly meeting is tomorrow if you want to join
*Thread Reply:* I added you to the invite just in case
*Thread Reply:* To answer your original question, we started outside the Airflow community. Now that the project is more established, it is easier to get this approved. Hoping to get this to a vote soon
@patrice quillevere has joined the channel
@Nicolas Pittion-Rossillon has joined the channel
@Farnaz Salamatjoo has joined the channel
@Rishabh Pareek has joined the channel