Frequently asked questions (FAQs)


Monolith FAQs

Data FAQs

How much data do I need to properly train a model?

The data requirements for training machine learning models vary depending on the use case and your objective. For example, you can get started with a small amount of data if your goal is to optimize a test plan using the iterative, active learning approach in the Next Test Recommender.  However, if you need to calibrate a highly non-linear and multivariate system to a high level of accuracy, you will need a lot of data. 

Additional Resources: 

Blog:  How much data do I need to use AI? 

How much data do I need?
  • 10 designs or test results: All you want to do is make sure that the next engineer you hire has access to the previous design files and information you gathered, so they get up to speed faster. This can work with as few as 3 data points or designs. A simple example: an aerodynamicist joins a racing company and needs to design a spoiler for a new car. Their first step will be to look at the last 3 cars the company built and learn from them. You can make this search-and-learn process much easier by saving the data in interactive 3D and functional data dashboards rather than having them dig through old folders. You could also solve it with 3 well-organised folders of neatly structured data so that people can compare future cars, so there is no need to use AI here.

  • 50 designs or test results: You want to use insights from product testing or development to help your engineers make better decisions faster. You have noticed considerable repetition, and from your last 50 projects you can learn quite a few things using algorithmic methods. You can detect correlations, look at failure scenarios, and build simple models to recommend what to test next. At this data level, algorithms can be a useful extension of the engineering expertise you already have.

  • 150 designs or test results: You can build an AI model to predict the result of tests or physical simulations. For example, you could predict the performance of a rim in a wind tunnel test, or the maximum stress in a suspension system. The predictions will not be highly accurate if the problem is hard, so at this data level most people use AI for faster decision-making at the pre-design stage, building models on their own test data that is not biased by simplifying physical assumptions.

  • 250 designs or test results: You can build recommender systems that deliver genuinely useful insights into what other solutions you could try, and run targeted optimisation codes that tell you how to design things differently. This is the typical size of a design of experiments for optimisation studies based on CAE models, so we see a lot of those at this data level.

  • 500 designs or test results: You can build AI models that predict the outcome of repetitive processes with good accuracy, sometimes as good as or even better than simplified physical simulations. This tends to be highly beneficial: companies save a lot of time and money on performing tests and running simulations once they can validate an AI model for the scenario.

  • 1000 designs or test results: You can build fully automated workflows. This is every engineering CIO's or CTO's dream. Imagine this: a customer provides your team with their requirements, you enter them into an online form, and an AI algorithm goes into your PDM system and creates a new product or component for those requirements fully automatically. We have seen this work for repetitive components such as sealing solutions, pumps, and bearings, i.e. for suppliers who create many thousands of versions of the same component every year.

 

3D vs tabular data: which one works best?

Both 3D and tabular data can be useful for AI depending on the specific task and the nature of the data. 3D data is typically used in generative design tasks on Monolith, where the input data consists of geometry information. 

Tabular data, on the other hand, consists of data arranged in rows and columns, similar to a spreadsheet. This type of data is commonly used in machine learning tasks such as predictive modelling and classification, where the goal is to predict a target variable based on a set of input features. 

In these tasks, various machine learning algorithms such as decision trees, random forests, and neural networks can be used to analyze and learn from the tabular data. In summary, the choice between using 3D or tabular data depends on the nature of the task and the type of data available.

Example where we worked with both file types: Kautex Use Case

Numerical vs categorical data?

The type of data used in an AI task depends on the nature of the problem and the type of model being used. For example, numerical data is often used in regression tasks, where the goal is to predict a continuous output variable based on input features. In this case, algorithms such as linear regression or neural networks are used to analyse and learn from the numerical data. 

Categorical data, on the other hand, is often used in classification tasks, where the goal is to predict a categorical output variable based on input features. In this case, algorithms such as decision trees, random forests, or support vector machines are used to analyse and learn from the categorical data. 

It is also possible to convert categorical data to numerical data using techniques such as one-hot encoding, where each category is represented as a binary value. This allows categorical data to be used in algorithms that require numerical input. 

In summary, both numerical and categorical data have their uses in AI, and the choice of data type depends on the nature of the problem and the type of model being used.
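As a minimal sketch of the one-hot encoding mentioned above (the column names and values below are invented for illustration):

```python
import pandas as pd

# Invented example data mixing a numerical and a categorical input
df = pd.DataFrame({
    "load_kg": [120, 150, 130],                   # numerical feature
    "material": ["steel", "aluminium", "steel"],  # categorical feature
})

# One-hot encode the categorical column: each category becomes a binary column
encoded = pd.get_dummies(df, columns=["material"])
print(encoded)
# Result has columns: load_kg, material_aluminium, material_steel
```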

 

Do you support categorical data as well as numerical?

Monolith supports both numerical and categorical data as inputs to a model.    

 


Is there a limit to how much data I can import?

There is no concrete number we can quote: for 3D data the practical limit depends on mesh sizes, and for tabular data it depends on the models you want to train, the number of rows versus the number of columns, and so on. For guidance on a specific dataset, our software team is happy to advise.

In general, we do not stop users from uploading data.

 

Which platforms does Monolith interact with (APIs)?

The Monolith API is a RESTful interface that can be provided to customers to allow them to automate their workflow. Some customers are currently using it to provide third party access to Monolith models, but it’s flexible and can be used for many applications. 

An example is collecting test data in a SQL database. In one case, the database was not directly accessible, but the customer had built a web interface that lets users filter their data, with the query built and sent to the database in the background. Users currently download the data from this web interface to their machine and then upload it to the Monolith platform. Technically, the web interface could instead send the data retrieved from the database directly to the Monolith S3 bucket via an API call.

Because end users are consumers of such a web interface, a change like this has to be requested from the team that owns it. A practical first step is to prototype an API push to the Monolith AWS storage from the command line, so there is a working demonstration to show the IT team.
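A prototype push could look roughly like the sketch below. Note that the endpoint URL, authentication scheme, and field names are illustrative assumptions, not the documented Monolith API; your Monolith contact can provide the real interface details.

```python
import requests

# Hypothetical values for illustration only; these are NOT the real
# Monolith API endpoint or authentication scheme.
API_URL = "https://example.monolithai.com/api/v1/datasets"
API_TOKEN = "YOUR_API_TOKEN"

# Push a CSV exported from the database query to cloud storage
with open("test_results.csv", "rb") as f:
    response = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {API_TOKEN}"},
        files={"file": ("test_results.csv", f, "text/csv")},
    )

response.raise_for_status()  # fail loudly if the upload was rejected
print(response.json())
```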

Is there an installation process, if yes, how does that work?

We will provide access to the Monolith platform for your users. Monolith will be hosted in the environment agreed in the contract that best secures your data, keeps the tool accessible to your engineers, and integrates easily into your workflow. IT teams from Monolith AI and your organisation should be involved from the beginning of the project to ensure that your security needs are identified and addressed quickly, and to scope your storage requirements and cloud or on-premise configuration for a smooth and speedy installation.

For dedicated installation guidance, feel free to contact our experts.

Data ownership – who owns the data after being imported into the platform?

Data is and will always be 100% yours! The same goes for your IP. 

You are always able to give our team access to set up or debug notebooks.

Monolith is a development platform with model architectures ready to be trained using your data.  Any data uploaded to the platform and the models trained from it are owned by the customer.   

How do we know if the case study is feasible?

The 3 critical Feasibility milestones are:

  1. Understand the Customer: We assess our customers’ existing workflows, processes and programmes, as well as the availability and quality of engineering data for AI modelling.

  2. Establish the Value: We identify our customers’ pain points with their status quo and scope our solution to resolve these pain points and provide quantified business value to the organisation.

  3. Present a Proposal: We present a project proposal to the stakeholders within our customers’ organisation and communicate the benefits and ROI that they can gain by partnering with Monolith to adopt AI in their engineering to test less and learn more.

---

The Monolith team will support you with the following points.

Define the problem: We will start by defining the problem statement clearly. What is the business problem that you are trying to solve? What are the goals of the project, and how will you measure success? It is essential to have a clear understanding of the problem before moving forward.

Gather and evaluate data: Collect data that is relevant to the problem at hand. The quality and quantity of the data are critical factors in determining feasibility. Evaluate the data to ensure that it is representative of the problem, and that it is accurate, complete, and consistent.

Determine the AI approach: Based on the problem definition and data, determine the most appropriate AI approach to use. For example, if the problem involves classification, then a decision tree or a neural network may be appropriate. If the problem involves forecasting, then time-series analysis or regression techniques may be more suitable.

Assess technical feasibility: Once the AI approach has been identified, assess the technical feasibility of the solution. Are there any technical constraints that need to be addressed?

Evaluate economic feasibility: Evaluate the economic feasibility of the solution. Estimate the costs of implementing the AI solution, including data acquisition and processing, hardware and software costs, and ongoing maintenance and support. Assess the potential benefits and return on investment (ROI) of the solution.

Test the solution: Test the AI solution to determine if it meets the defined objectives and requirements. Evaluate the accuracy and performance of the solution using appropriate metrics.

Can I import data directly from other tools/software/platforms?

You can import data directly into the Monolith platform in two ways: 

  • Using SQL queries to import data from a database 
  • Using the Monolith API to automate the process of loading, running, and returning results from a notebook or dashboard 

If you want to take advantage of these capabilities, our team will work with you to go through the initial setup and guide you in your first implementation.   
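For context, the SQL route is conceptually similar to the generic Python sketch below (the connection string, table, and column names are placeholders; inside Monolith the built-in step handles this for you):

```python
import pandas as pd
from sqlalchemy import create_engine

# Placeholder connection string and schema for illustration
engine = create_engine("postgresql://user:password@test-db.example.com/results")

# Pull only the rows and columns relevant to the modelling task
query = """
    SELECT test_id, temperature_c, pressure_bar, max_stress_mpa
    FROM vibration_tests
    WHERE test_date >= '2023-01-01'
"""
df = pd.read_sql(query, engine)
print(df.head())
```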

 


Can I use Monolith to transform and clean my data?

Monolith has a library of built-in steps to clean and transform your data, removing unnecessary factors or extracting new ones (e.g. taking the derivative of a time-series variable) that may be useful as inputs to a model.  For example, you can find gaps, ranges, and common values in your data sets to quickly assess if more preparation is needed before modeling. 

For tabular data, Monolith has a wide range of steps to manipulate and transform your data, including steps to join tables, append datasets, add columns, transform tables (from wide to long and vice versa), or group data to perform calculations on different subsets and combinations. For time-series test data, you can restructure data sets to align and normalize them to a common sampling frequency.   Or, you can quickly identify specific attributes of a time-series data set to find the amplitude,  range, RMS and mean values, derivatives, moving averages, and more. 

 

In addition to the built-in steps for transforming your data, you can also use the Custom Code step to perform specific data manipulations that are not available in built-in Monolith steps.   
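As an example of the kind of transformation a Custom Code step can express, here is a minimal pandas sketch (the signal and column names are invented for illustration):

```python
import numpy as np
import pandas as pd

# Invented time-series test data standing in for a measured signal
df = pd.DataFrame({"time_s": np.arange(0, 10, 0.01)})
df["accel_g"] = np.sin(2 * np.pi * df["time_s"])

# Extract features of the kind described above
df["accel_moving_avg"] = df["accel_g"].rolling(window=50).mean()
df["accel_derivative"] = df["accel_g"].diff() / df["time_s"].diff()

rms = np.sqrt(np.mean(df["accel_g"] ** 2))             # RMS value
amplitude = df["accel_g"].max() - df["accel_g"].min()  # peak-to-peak range

print(f"RMS: {rms:.3f}, amplitude: {amplitude:.3f}")
```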

 


What Monolith tools are available for exploring my data?

Monolith has a number of built-in tools to explore and visualise your data to gain valuable intelligence before using it to train models. During exploration, you can use the 2D or 3D visualisation plots to identify outliers in the data, check for noise within the data, check if the distribution of data is evenly spread across the design space and identify any relationships between factors that can inform your choice of model. 

For more complex data sets with multiple parameters, where a simple 2D or 3D plot may be difficult to interpret, you can use more advanced visualization tools to find non-trivial relationships in your data: 

 

  • Parallel Coordinates – An interactive plot to help you understand relationships between 5 to 10 different variables (see the sketch after this list) 
  • Intelligent Correlation – Helps you identify which inputs are most likely to influence your outputs 
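For reference, the parallel-coordinates idea can be reproduced in a few lines of Python (the design data below is invented for illustration; inside Monolith the plot is interactive):

```python
import matplotlib.pyplot as plt
import pandas as pd
from pandas.plotting import parallel_coordinates

# Invented design-of-experiments data with several variables per design
df = pd.DataFrame({
    "length_mm": [100, 120, 140, 110, 130],
    "width_mm": [40, 45, 50, 42, 48],
    "thickness_mm": [2.0, 2.5, 3.0, 2.2, 2.8],
    "max_stress_mpa": [210, 190, 170, 205, 180],
    "result": ["fail", "pass", "pass", "fail", "pass"],
})

# One vertical axis per variable, one line per design, coloured by outcome
parallel_coordinates(df, class_column="result", colormap="viridis")
plt.show()
```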

 

We also have feature extraction functions to analyze time-series data to identify attributes such as amplitude peak, range, RMS, mean values, moving averages, and so on.  

 


AI / Modelling FAQs

What modeling algorithms are supported in Monolith?

Monolith has a wide range of modeling algorithms built into the platform, including simple models like Polynomial and Decision Tree Regressions, and more complex models, like Gaussian Process Regressions or Neural Networks.

Although data scientists may understand when to apply these different types of models, we recognize that engineering domain experts may not be comfortable working with these algorithms and choosing the one most appropriate.  To help, Monolith has built-in bulk modeling tools to help you choose the right algorithm for your data (see next question for further explanation). 
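Conceptually, this is similar to fitting several model families on the same data and comparing cross-validated scores, as in the scikit-learn sketch below (synthetic data; this is an analogy for illustration, not Monolith's implementation):

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPRegressor
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for engineering test data
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

candidates = {
    "linear": LinearRegression(),
    "decision_tree": DecisionTreeRegressor(random_state=0),
    "random_forest": RandomForestRegressor(random_state=0),
    "gaussian_process": GaussianProcessRegressor(),
    "neural_network": MLPRegressor(max_iter=2000, random_state=0),
}

# Cross-validated R^2 gives a quick ranking of model families
for name, model in candidates.items():
    score = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: R^2 = {score:.3f}")
```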

How do I know I’m using the best model for my application?

With the built-in Bulk Modeling feature, you can specify different aspects of your data to get recommendations for which models are most effective. 

With Bulk Modeling, you describe key characteristics of your data or design to be modeled, and the software recommends which modelling algorithms will work best.   

 

Additional Resources: 

 Documentation: Bulk Modeling  

How do I know the model I create is accurate?

Model accuracy can be used as a proxy for your understanding of the product or system behavior.  Monolith has several model evaluation tools available to help you quantify the accuracy of your model to determine if it is “good enough” to be used in production.  When you specify the set of data in the platform, it is good practice to split the data and reserve a small percentage (typically 20%) to evaluate the model. When your model is trained on the rest of the data (training set, usually 80%), you can run the test set through the model and assess the prediction errors using the Predicted vs Actual evaluation step, or analyze your model results against specific performance criteria using the Compare Against Criteria step.    
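The same 80/20 evaluation pattern can be sketched outside the platform as follows (synthetic data for illustration; in Monolith, the Predicted vs Actual step plays this role):

```python
import matplotlib.pyplot as plt
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in; in practice X, y come from your test data
X, y = make_regression(n_samples=500, n_features=4, noise=5.0, random_state=1)

# Reserve 20% of the data for evaluation, train on the remaining 80%
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1
)

model = RandomForestRegressor(random_state=1).fit(X_train, y_train)
y_pred = model.predict(X_test)

print(f"R^2 on held-out test set: {r2_score(y_test, y_pred):.3f}")

# Predicted vs actual: points near the diagonal indicate an accurate model
plt.scatter(y_test, y_pred, alpha=0.5)
plt.plot([y.min(), y.max()], [y.min(), y.max()], "k--")
plt.xlabel("Actual")
plt.ylabel("Predicted")
plt.show()
```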

 

Can I use models I create in Monolith in other tools?

Monolith models can be used in other tools through a few mechanisms: 

  • Using Monolith APIs, you can pass data through your models in the Monolith environment from a separate application (see the sketch below). 
  • Some customers have built models and exported specific model parameters used as coefficients for embedded algorithms running on hardware. 
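For the API route, a call could look roughly like this sketch (the URL, authentication, and payload shape are illustrative assumptions, not the documented Monolith API):

```python
import requests

# Hypothetical values for illustration only; consult the Monolith API
# documentation for the real endpoint and payload format.
MODEL_URL = "https://example.monolithai.com/api/v1/models/my-model/predict"
API_TOKEN = "YOUR_API_TOKEN"

# Invented feature names; pass the same inputs the model was trained on
payload = {"inputs": [{"load_kg": 120, "material": "steel"}]}

response = requests.post(
    MODEL_URL,
    headers={"Authorization": f"Bearer {API_TOKEN}"},
    json=payload,
)
response.raise_for_status()
print(response.json())  # predictions returned by the hosted model
```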

 

Monolith does not currently export models directly for embedded hardware use cases.   

Can my data science team benefit from using Monolith?

Engineers who use Monolith are more self-sufficient because they can build models based on test data that they know and fully understand. This allows your data scientists, who are often a scarce resource, to assist with advanced modeling questions without being as involved in the implementation details. 

In some cases, the data science team will use Monolith as a fast prototyping tool to quickly find the best type of model for their given challenge. We've also seen examples where the data science team uses the dashboarding capabilities in Monolith as a mechanism for hosting and gaining adoption for their modeling work within the engineering workflow.

Platform / Environment FAQs

Is there an installation process for Monolith?

Monolith is cloud-based, so there is nothing to download or install on local computers.  We have a shared AWS cloud offering as well as customer-specific AWS cloud options to address your security needs. 

In addition, we work hand-in-hand with your team so we can scope the storage and computational resources needed for the project.  

How can my team get started with AI and Monolith?

When you engage with Monolith, we offer a structured Discovery and Feasibility process to understand your business goals, the state of your data, and your development and testing processes. 

We work closely with your team to prioritize, understand, and agree on the potential business value of the project, the technical approach, and the key roles and responsibilities needed to execute the project successfully. Contact us to get started.  

 

What are the benefits of using a “No Code” AI tool like Monolith over traditional python programming?

Monolith is a commercial platform that helps engineers build scalable, maintainable, and efficient AI solutions very quickly.  Both technical and business considerations factor into your decision to purchase a commercially available platform like Monolith vs. building your own solution. 

For mission-critical uses of AI, Monolith has advantages for the individual engineer and the engineering department leaders to help establish and sustain valuable AI models as part of your development process.   

 

Can I incorporate my own custom code into Monolith?

In addition to the Custom Code step, in which you can add python code to a notebook, we can work with you on other possibilities to add your existing python code into the platform. 

How can I share my work with others in my team/company?

Monolith is built for collaborative use across large, enterprise-level engineering teams. The platform offers a wide range of features designed for capturing and sharing knowledge across the organization, including:   

  • Dashboards - summarize your data science notebooks with an interactive dashboard that enables other users to modify inputs, load new data, and see how the model-based predictions change 
  • Team Features – organize your data and notebooks based on your team structure.  Control access and capabilities with role-based logins.   
  • API – integrate with your existing processes using the Monolith API.  Automate data flowing from your design tools or test stations directly into Monolith notebooks and dashboards to process data, generate new test conditions, and more.   

 

How can I manage projects within my team/company in Monolith?

In Monolith, you can organize your team's data and notebooks into private folders, giving you control over team access and security while avoiding confusion between overlapping projects. 

 


"Using Monolith AI platform, we were able to import our rich test stand data and apply machine learning models to conclude on different development options much faster. " 
Dr. Bas Kastelein, Sr. Director of Product Innovation, Honeywell Process Solutions

 

No code software

 

AI built by engineers for engineers

 

  • Avoid wasted tests due to faulty data
  • Build just the right test plan - no more, no less 
  • Understand what drives product performance and failure
  • Calibrate non-linear systems for any condition 

Ready to get started with AI?
