Microsoft Fabric environment - A consolidated item for all your hardware and software settings

Segment 1 (00:00 - 05:00)

Hey everyone, welcome back. This is the Fabric Espresso YouTube series about data engineering, data science, data ingestion, and basically playing with data inside Microsoft Fabric, the platform that we are building. We are part of the Microsoft product group working on and building Microsoft Fabric, and this is a series coming directly from the product management and product engineering teams. Today we have a special guest: Shuaijun is joining me to talk about the environment artifact, the environment item. But before that, Shuaijun, could you please take a moment to introduce yourself and share which features you are working on? I know that you are also working on one more critical item.

Thank you. Hello everyone, my name is Shuaijun, and I'm one of the product managers on the Fabric data engineering team. I work on a lot of areas related to the developer experience. The item I'm specifically working on is the environment, which is a consolidated item for managing all your hardware and software settings when you want to use Spark in Fabric. Another area I'm working on is library management: if you have libraries you want to install for your development, for example in notebooks, in Spark job definitions, and especially in environments, I'm the PM working on this area as well.

Thank you, that's awesome. Today the topic of the episode is managing Spark through the Fabric environment item; we used to call it the environment artifact. Before jumping into the details, can you give us the context: what is an environment, when should we use it, and, for those who are familiar with other big data processing platforms like Synapse or Databricks, how should they think about the environment knowing those products?

Okay, Estera. As you can see, the environment is an item, or what used to be called an artifact, in the Fabric system. It works like any other item in Fabric, like a notebook or a Spark job definition. You create a new environment, and within one environment you have the ability to manage your libraries, the Spark runtime, and also the compute configuration: for example, what pool size you want to use and what compute resources you want to use. All those configurations can be customized by you, and once it's all set up you can attach the environment to your notebooks and Spark job definitions, so all the libraries and configurations will be there once your session starts. The beauty of having the environment is that you can package all your configurations and libraries together in one place. The Fabric system has Git and deployment pipelines supported, and REST APIs as well, so you also have a seamless code-first experience of managing all the configuration through CI/CD and through the APIs. Comparing with other well-known systems: in Synapse we have the similar concept of the pool, so in one pool you can also manage your configuration and install libraries. We provide similar functionality, but in Fabric we have better support for CI/CD and for the code-first experience of building your own pipelines.

Awesome. So is this summary correct: in Synapse, because Synapse Analytics is a PaaS service, as a user you had to create a pool and configure it, and then you could use Apache Spark to process your data. In Fabric, because Fabric is a SaaS platform, you don't have to configure any runtime or hardware switches or provide any sophisticated settings; it works out of the box. But if you want to customize anything, such as libraries, Spark settings, or compute, then the environment is for you. Is that correct?

Yes, exactly.

Awesome. So should we take a look at the demo right now?

Yeah, please.

Awesome.

The environment is a consolidated item for software and hardware settings. You can personalize your own environment by choosing the Spark runtime, installing libraries, and configuring Spark compute configurations and properties. To apply the configurations and

Segment 2 (05:00 - 10:00)

libraries, they need to be published. Typically this publishing process takes a few minutes when it includes a library update. During this process, the compute configurations are validated and the dependency tree of the libraries is constructed to ensure they work correctly in notebooks and Spark job definitions. Besides these core functionalities, one thing you may have noticed is that there's a built-in library section here. It's a new feature that will be released soon. Each Fabric Spark runtime includes different built-in libraries in various versions. The built-in library list allows you to determine whether a library is provided by default and whether it matches the expected version. So next time, before installing new libraries, we can use the built-in library section to check whether a library is pre-installed or not.

Another new feature, which has been released already, is the resources folder. It provides a file system that enables you to manage small resources during development. Resources are independent from the configurations: you can interact with the folders and files in real time, and no publishing is required. You can access environment resources from a notebook when the environment is attached. Let's see an example. I upload this package to my environment, then switch to the notebook and refresh the explorer, and we can see the file is available in my notebook. I can drag and drop it into a notebook cell, and the code snippet is generated automatically. From the notebook I can also manipulate the resources folder of the environment; for example, I can save this simple DataFrame as a CSV file in the environment. The advantage of this capability is that a single environment can be linked to multiple notebooks, which facilitates collaboration across different notebooks with shared storage.

Looks fantastic. We could see the libraries: built-in, public, custom. We could see compute and Spark properties, and there was one more, the resources tab, which is integrated with the notebook. Could you please share what the stage is? Is it in public preview, or is it GA, meaning ready for production workloads?

Yeah, thank you so much. Basically, all the features you've seen in the environment are GA features, except for the built-in library section; that's a feature we're about to release, so you might be able to see it next month.

Awesome. And could you share what's coming? What is the new tab that is coming to the environment, the one we have been discussing almost every day this week and last week?

Yes, exactly. We're about to have the acceleration tab in the environment. Estera, maybe you want to be the one to introduce the features we're going to provide?

So we are getting a new tab under Spark compute, Acceleration, for users who are using the native execution engine and also, in the future, for the Autotune functionality. Right now, those two acceleration-related features can be enabled through Spark settings. But for those who are not familiar with configuring Spark settings and don't want to play with them, resolving setting names and so on, we'll have the UI, and this is going to be part of the portal experience for the environment artifact.

Yeah. And is this going to be GA or public preview?

Ah, beautiful question. It will be in public preview. We're adding a tab, and inside the tab we are adding an option to enable those features, so the maturity follows the features themselves: the native engine will be in public preview at the beginning, Autotune is still in public preview, and we have a timeline for the native engine to reach GA. Once the acceleration tab lands, the preview label will be reflected for sure, so our customers will be informed.

Okay. Should we discuss the next topic, which is fundamental to every pro-dev experience, and also matters to people who want to get things done and make sure that developers' requirements and rules are met, meaning CI/CD? So should we start?

Yes, definitely.

My workspace is connected to a Git repo with a main branch

Segment 3 (10:00 - 15:00)

and the environment is synced in Git. The local representations of the environment are YAML files. Let's add a public library as an example. I switch to a dev branch, add emoji and wordcloud, and then I commit and push the changes from local to Git. After doing this, I can go to the Git repo and create the PR for my changes, and once my PR is reviewed, the changes can be merged from the dev branch to main. Now I can switch back to Fabric and sync my environment with the changes from Git. Let me refresh my environment. You can see the changes coming from Git are saved. Like changes made in the Fabric portal, changes from Git also need to be published to become effective.

Furthermore, Fabric's deployment pipelines simplify the process of delivering modified content across different phases, such as moving from development to test. Environments are supported as well: with a single click of Deploy, the environments are automatically synced from one phase to another. One thing that I would like to highlight is that the deployment pipeline supports syncing published environments: if your environment is published in the original workspace, it will stay the same in the destination workspace after deploying.

Great demo. We started by seeing what the feature is about and what the integrations look like. Now can you tell us more: what's the context, what are the rules, what are the principles behind CI/CD for the environment?

Yep, thank you, Estera. As you can see, the Fabric platform provides the Git and deployment pipeline features in the Fabric workspace. Right now each workspace can be connected to a Git repo attached to a certain branch, and that is how you are able to have the Git experience. The repo is an Azure DevOps repo. With this functionality, items in Fabric can be connected to Git so that you have the ability to manage their content in your local IDE or just in the repo itself. We want to give users the ability to manage the environment and its configurations end to end.

Another part of the development cycle is deployment. Once you think your development is done, you can move from one phase to another, and that comes with the deployment pipeline. It is also a service provided by Fabric that lets you move your built items from one workspace to another. You can mark one workspace as your test workspace and another as your production workspace, and once you think all the work is done at one stage, you can use a deployment pipeline, through a single click of Deploy, to move everything from test to production. The environment has Git and deployment pipeline support in all these scenarios.

One more detail I would like to share: we consider the changes you make in Git to be similar to the changes you make in the portal, meaning they are part of the development experience. Maybe you want to do some testing or try different configurations; instead of using the portal, you can use Git to make sure all those configurations are set up correctly. After you make the changes locally and get them reflected in your Git repo, you can go back to the Fabric portal and update all the changes from Git. All the changes will be reflected in your environment, and you can then publish the environment to make them effective.

As for the deployment pipeline, we consider it a pipeline for managing different stages. At advanced stages, you might have your environment already set up and be able to use all the configurations in your notebooks or Spark job definitions, so all the configurations are prepared and packaged together. Once you deploy from one workspace to another, all the contents will be prepared for you, meaning that, for

Segment 4 (15:00 - 20:00)

example, if in the original environment all your changes are already published and already effective, they will stay the same in the destination workspace. So yeah, thank you.

Awesome. Can you comment on whether we can bring our own YAML file?

Yes, definitely. In the portal experience, in the public library section, you can upload a YAML file to specify which libraries you want from pip and conda, and in the Spark properties section you can upload a YAML file for your property key-value pairs. So in the portal we support managing all the libraries and settings in a batch. In the local IDE, you will see that environments are represented as YAML files: the libraries, the settings, the compute, and the properties are all represented in YAML format. So one way is to customize the existing YAML, and another way is to prepare your own YAML files, customize everything in one YAML file, and upload it in the portal or sync it through the Git repo, so that everything can be recognized in the environment.

Awesome. So we can create an environment, customize it by bringing libraries, Spark settings, hardware optimization, and pool configuration, and then everything is reflected in Git. You mentioned Azure DevOps. As of now, do we plan to support GitLab and GitHub?

Yeah, that's a great question. GitHub is actually on our plan, and GitLab, I would say, is an even later step.

Awesome. So we've covered the functionality, the stage, and CI/CD. There are lots of customers, especially ISVs, who rely heavily on REST APIs. Do we have a REST API for the environment? I assume yes, so how does it work, and what are the endpoints?

Yeah, thank you so much. Yes, we do support REST APIs for the environment. Basically all the functionality you've seen in the portal today is supported through the REST API. The basic REST APIs include create, delete, and get the content of the environment. Beyond this, we also support APIs to, for example, upload a library, delete a library, and update the Spark compute configuration. So the parts supported through the REST API are the libraries, the Spark runtime, and the Spark compute configurations. Resources will be the next component we're considering supporting through the API, but the configuration portions already have API support right now. All the REST APIs are in the public preview stage. You can refer to an article in our Microsoft public documentation that shows the end-to-end experience of how you can create an environment and manage all its contents through the APIs.

Another part I want to cover: we all know that in order to make an environment effective, we have this process called publish, which makes all the configurations ready. The API also lets you publish the environment, and I would call it a really powerful API. In some scenarios, for example the Git experience we just covered, after syncing the changes from Git a publish is required if you want to make sure all the configurations are working. Instead of going into the portal and publishing manually, you can just call the publish API; the publish will be processed and become ready after a few minutes. So publish is a very important API. There is also another API, cancel publish: you can check the status of the publish at any moment, and if you made a mistake, you can cancel the publish.

Yeah. So it sounds like we create the artifact and edit the environment, but we have to remember that this API is two-stage: editing is one element, saving, and then comes publishing. We always have to remember that the changes will be effective only after publishing. Awesome. We'll add a link in the description so we can take a look at the public documentation, where all the APIs are listed, described, and categorized, and there are a few examples of how to, for example, change the runtime version for your environment. It's also a good opportunity to mention that per environment you can

Segment 5 (20:00 - 23:00)

select a different runtime version, and then you can attach that environment to different notebooks and Spark job definitions, or make it the default workspace environment. Could you please share the context behind the default environment that we can configure at the workspace level?

Yes, that's good. Environments as items are independent from one another. For example, in one workspace you can have multiple environments, each independent from the others. In one environment you can configure the runtime with a certain version, like 1.3, and in another environment you can have a different runtime version, 1.4, and these two environments won't impact each other. All the notebooks attached to environment one are going to use the configuration of environment one, and notebooks using environment two are going to use the configuration of environment two.

Another thing Estera mentioned is the workspace default environment. Once you have the environment ready, if you think, "I'm not happy using this environment for just certain notebooks; I want to make it a general experience for my notebooks," you can go to the workspace settings and find the environment section, where you can attach one environment as the workspace default. After you attach an environment as the default, all the notebooks and Spark job definitions in this workspace that are not attached to any other environment will use the workspace default. That means if you create a new notebook or a new Spark job definition, it will automatically have the configuration from the default environment, so you don't need to configure the environment one by one in your notebooks; you get all the settings of this environment by default for your workspace.

Awesome. So by default you don't have to configure any runtime-related switches or settings; you just get a runtime version. If you need to tune it or configure it, you can create an environment and promote it to the workspace-level default environment. Awesome. Shuaijun, thanks for joining and sharing all those details about the environment item. For those who are watching us, remember to hit the like button, and if you appreciate our work, please also leave a comment, ask a question, and suggest a topic for the next Fabric Espresso episode. Until next time, happy configuring and managing Apache Spark in Fabric through the environment item. Thanks, and see you.

Thank you, see you.
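
The two-stage behavior described in the conversation, where saved changes stay staged until a publish makes them effective, can be illustrated with a small sketch. This is a toy in-memory model and not the Fabric REST API: the class `ToyEnvironment`, its methods, and the runtime values used here are hypothetical names chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ToyEnvironment:
    """Hypothetical stand-in for an environment item (not a Fabric client)."""

    # Settings that attached notebooks would actually see.
    published: Dict[str, str] = field(default_factory=dict)
    # Settings that were saved (portal edit, Git sync, API call) but not published.
    staged: Dict[str, str] = field(default_factory=dict)

    def save(self, key: str, value: str) -> None:
        # Saving only records the change in staging; nothing is effective yet.
        self.staged[key] = value

    def publish(self) -> None:
        # Publishing promotes every staged change to the effective settings.
        self.published.update(self.staged)
        self.staged.clear()

    def effective(self, key: str) -> Optional[str]:
        # A session only ever reads published settings.
        return self.published.get(key)


# Illustrative values only: start from one runtime version and stage another.
env = ToyEnvironment(published={"runtime": "1.2"})
env.save("runtime", "1.3")
before = env.effective("runtime")  # read before publishing
env.publish()
after = env.effective("runtime")   # read after publishing
print(before, after)
```

The model keeps staged and published settings in separate dictionaries so that the difference between saving and publishing is visible in the data itself; a real client would instead call the publish endpoint and poll its status, as described in the public documentation mentioned above.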
