
Build an Artistic Camera and See How It Performs on the A13 Bionic Chip’s Neural Engine
Style transfer is a very popular deep learning task that lets you recompose the content of one image in the visual style of another.
From building artistic photo editors to giving your game design a fresh look with cutting-edge themes, there's a lot you can create with neural style transfer models. It can also be useful for data augmentation.
At WWDC 2020, Create ML (Apple's model-building framework) got a major boost with the addition of style transfer models. Although it ships with the Xcode 12 update, you'll need macOS Big Sur (in beta at the time of writing) to train style transfer models.
Create ML has now unlocked the ability to train Style Transfer Models directly from your MacBook. You can train both image and video style transfer convolutional neural networks, the latter using only a limited set of convolutional filters to optimize it for real-time image processing.
To get started, you need three things:
- A style image (also called a style reference image). Generally, you can use famous paintings or abstract artwork for your model to learn the style from. In our case, we'll use a pencil sketch image for our first model (see the screenshot below).
- A validation image that helps visualize the quality of the model during the training process.
- A dataset of content images that acts as our training data. To get optimal results, it's best to use a directory of images similar to the ones you'll run inference on.
In this article, I'll use this celebrity image dataset for our content images.
Here's a glimpse of how my Create ML style transfer Settings tab looks before training the model.

The validation image below shows a real-time style transfer applied at each iteration interval. Here is a glimpse of it:

It is worth noting that the style loss and content loss are indicators that help us understand the balance between the style and content images. Typically, the style loss should decrease over time, indicating that the neural network is learning to adopt the artistic traits of the style image.
While the default model parameters work great, Create ML allows us to customize them for specific use cases.
Setting the Style Strength parameter to low stylizes only parts of the background, leaving the primary subject largely intact, while setting it to high adds more style texture to the edges of the image.
Similarly, a coarse Style Density uses only the high-level details of the style image (such models train much faster), while a fine density lets the model learn its minute details.
Create ML style transfer models train with the default number of iterations set to 500, which is ideal for most use cases. Iterations are the number of batches required to complete one epoch, and one epoch equals one training cycle over the entire dataset. For example, if the training dataset contains 500 images and the batch size is 50, an epoch is completed in 10 iterations (note: Create ML doesn't tell you the batch size it uses during training).
This year, Create ML also introduced a new feature called model snapshots. These allow us to capture intermediate Core ML models during training and export them into our apps. However, models taken from snapshots are not optimized for size and are significantly larger than the model generated upon training completion (specifically, the Core ML models in the snapshots I took were in the range of 5–6 MB, while the final model's size was 596 KB).
The following GIF shows one such example, where I've compared model snapshot results at different iterations:

Notice how in one of the images, the style isn't applied across the whole image. This is because the style image used was too small, so the network wasn't able to learn enough style information, causing the generated image to be of sub-par quality.
Ideally, using a style image of at least 512 px will ensure good results.
In the following sections, we'll create an iOS application that runs the style transfer models in real time. Here's a bird's-eye view of our next steps:
- Analyze the results of three video style transfer neural network models: one trained with the default parameters, and the other two with the style strength set to high and low, respectively.
- Implement a custom camera using AVFoundation in our iOS application.
- Run the generated Core ML models on the live camera feed. We'll use a Vision request to quickly run predictions and draw the stylized camera frames onto the screen.
- Compare the results on the CPU, GPU, and Neural Engine.
Finding a style image that produces good artistic results is tricky. Luckily, I found one with a simple Google search.
I have already trained three models with the same dataset. Here are the results:

As you can see above, the low-strength model hardly affects the content images with the given style image, while the high-strength one stylizes the edges much more heavily.
And with that, we have our models (about half a MB in size) ready to ship in our app.
Create ML also lets us preview video results, but it’s incredibly slow. Luckily, we’ll see them in real-time in our demo app shortly.
AVFoundation is a highly customizable Apple framework for media content. You can create custom overlays, fine-tune camera settings, perform photo segmentation with depth output, and analyze frames.
We'll focus primarily on analyzing frames: specifically, transforming them with style transfer and displaying them in an image view to create a live stylized camera feed (you could also use Metal for further customization, but for the sake of simplicity, we'll leave that out of this tutorial).
At a very basic level, building a custom camera involves the following components:
- AVCaptureSession: manages the entire camera session. Its functionality includes gaining access to the iOS input devices and passing the data on to output devices. AVCaptureSession also lets us define Preset types for different capture sessions.
- AVCaptureDevice: lets us select the front or rear camera. We can either go with the default settings or use AVCaptureDevice.DiscoverySession to filter and select hardware-specific features, such as the TrueDepth or wide-angle cameras.
- AVCaptureDeviceInput: provides the media source from the capture device and sends it to the capture session.
- AVCaptureOutput: an abstract class that provides the output media to the capture session. It also lets us handle the camera orientation. We can set multiple outputs (such as for the camera and microphone). For example, if you want to capture photos and movies, add AVCaptureMovieFileOutput and AVCapturePhotoOutput. In our case, we'll use AVCaptureVideoDataOutput, as it provides video frames for processing.
- AVCaptureVideoDataOutputSampleBufferDelegate: a protocol we can use to access every frame buffer inside the didOutput delegate method. To start receiving frames, we need to invoke the setSampleBufferDelegate method on AVCaptureVideoDataOutput.
- AVCaptureVideoPreviewLayer: basically a CALayer that visually displays the live camera feed from the output of the capture session. We can transform the layer with overlays and animations. It's important to set this up for the sample buffer delegate methods to work.
To start, add the NSCameraUsageDescription camera permission to your project's info.plist file in Xcode.
Now, it's time to create an AVCaptureSession in ViewController.swift:
let captureSession = AVCaptureSession()
captureSession.sessionPreset = AVCaptureSession.Preset.medium
Next, we'll filter and select the wide-angle camera from the list of available camera types on an AVCaptureDevice instance, wrap it in an AVCaptureDeviceInput, and add that input to the AVCaptureSession, as sketched below.
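Here's a minimal sketch of that input setup, assuming the captureSession constant from above (the rear camera position and the error handling here are illustrative choices, not necessarily those of the original project):
let discoverySession = AVCaptureDevice.DiscoverySession(
    deviceTypes: [.builtInWideAngleCamera],
    mediaType: .video,
    position: .back
)
guard let camera = discoverySession.devices.first,
      let cameraInput = try? AVCaptureDeviceInput(device: camera),
      captureSession.canAddInput(cameraInput)
else { return }
// Feed the wide-angle camera's frames into the capture session.
captureSession.addInput(cameraInput)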
Now that our input is set up, let’s add our video output to the capture session:
let videoOutput = AVCaptureVideoDataOutput()
videoOutput.alwaysDiscardsLateVideoFrames = true
videoOutput.setSampleBufferDelegate(self, queue: DispatchQueue(label: "videoQueue"))
if captureSession.canAddOutput(videoOutput){
    captureSession.addOutput(videoOutput)
}
The alwaysDiscardsLateVideoFrames property ensures that late frames are dropped, which keeps latency low.
Next, add the following piece of code to prevent a rotated camera feed:
guard let connection = videoOutput.connection(with: .video) else { return }
guard connection.isVideoOrientationSupported else { return }
connection.videoOrientation = .portrait
Note: To support all orientations, you need to set videoOrientation based on the current orientation of the device. The code is available at the end of this tutorial.
Finally, we can add our preview layer and start the camera session:
let previewLayer = AVCaptureVideoPreviewLayer(session: captureSession)
view.layer.addSublayer(previewLayer)
captureSession.startRunning()
Here's a look at the configureSession() method we just put together:
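Since that method is shown as an embedded snippet, here's a minimal sketch of what it assembles from the pieces above (the previewLayer frame assignment is an added assumption so the feed is actually visible):
func configureSession() {
    // Session preset, as configured earlier.
    captureSession.sessionPreset = .medium

    // Input: the rear wide-angle camera.
    let discovery = AVCaptureDevice.DiscoverySession(
        deviceTypes: [.builtInWideAngleCamera],
        mediaType: .video,
        position: .back
    )
    guard let camera = discovery.devices.first,
          let input = try? AVCaptureDeviceInput(device: camera),
          captureSession.canAddInput(input) else { return }
    captureSession.addInput(input)

    // Output: raw video frames delivered to our sample buffer delegate.
    let videoOutput = AVCaptureVideoDataOutput()
    videoOutput.alwaysDiscardsLateVideoFrames = true
    videoOutput.setSampleBufferDelegate(self, queue: DispatchQueue(label: "videoQueue"))
    guard captureSession.canAddOutput(videoOutput) else { return }
    captureSession.addOutput(videoOutput)

    // Lock the connection to portrait to avoid a rotated feed.
    if let connection = videoOutput.connection(with: .video),
       connection.isVideoOrientationSupported {
        connection.videoOrientation = .portrait
    }

    // Preview layer, then start the session.
    let previewLayer = AVCaptureVideoPreviewLayer(session: captureSession)
    previewLayer.frame = view.bounds
    view.layer.addSublayer(previewLayer)
    captureSession.startRunning()
}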

Now, to the machine learning part. We’ll be using the Vision framework to take care of the input image pre-processing for our style transfer model.
By conforming our view controller to the AVCaptureVideoDataOutputSampleBufferDelegate protocol, we can access each frame in the following delegate method:
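A minimal sketch of that conformance, with the actual frame handling filled in over the next few snippets:
extension ViewController: AVCaptureVideoDataOutputSampleBufferDelegate {

    func captureOutput(_ output: AVCaptureOutput,
                       didOutput sampleBuffer: CMSampleBuffer,
                       from connection: AVCaptureConnection) {
        // Called on the "videoQueue" dispatch queue for every incoming camera frame.
        // The style transfer prediction happens here (see the next snippet).
    }
}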
From the sample buffer instance above, we retrieve a CVPixelBuffer instance and pass it to the Vision request:
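A rough sketch of that hand-off inside the delegate method above (styleTransferRequest is a stand-in name for the VNCoreMLRequest we build below):
guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }

// Run the style transfer model on the current frame via Vision.
let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:])
do {
    try handler.perform([styleTransferRequest])
} catch {
    print(error)
}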
VNCoreMLModel acts as a container, inside which we instantiate our Core ML model in the following manner:
StyleBlue.init(configuration: config).model
config is an instance of type MLModelConfiguration. It's used to define the computeUnits property, which lets us set cpuOnly, cpuAndGPU, or all (Neural Engine) as the device hardware to run on.
let config = MLModelConfiguration()
switch currentModelConfig {
case 1:
    config.computeUnits = .cpuOnly
case 2:
    config.computeUnits = .cpuAndGPU
default:
    config.computeUnits = .all
}
Note: We've set up a UISegmentedControl that lets us switch between each of the above model configurations.
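As a rough illustration (the action name, the currentModelConfig property, and the hypothetical setupVisionRequest() helper are assumptions, not names from the original project), the control could be wired up like this:
@IBAction func modelConfigChanged(_ sender: UISegmentedControl) {
    // Segments: 0 = CPU only, 1 = CPU + GPU, 2 = All (Neural Engine).
    currentModelConfig = sender.selectedSegmentIndex + 1
    // Hypothetical helper that rebuilds the Vision request with the new MLModelConfiguration.
    setupVisionRequest()
}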
The VNCoreMLModel is passed into a VNCoreMLRequest, which returns observations of type VNPixelBufferObservation. VNPixelBufferObservation is a subclass of VNObservation that returns the image output as a CVPixelBuffer.
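Putting those pieces together, a sketch of the request and its completion handler might look like this (visionModel stands for the VNCoreMLModel wrapping the Core ML model instantiated above; imageView and the UIImage(pixelBuffer:) initializer, shown next, are assumptions):
let styleTransferRequest = VNCoreMLRequest(model: visionModel) { [weak self] request, _ in
    // The stylized frame comes back as a pixel buffer observation.
    guard let observation = request.results?.first as? VNPixelBufferObservation else { return }
    DispatchQueue.main.async {
        // Draw the stylized frame on screen.
        self?.imageView.image = UIImage(pixelBuffer: observation.pixelBuffer)
    }
}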
Using the following extension, we convert the CVPixelBuffer into a UIImage and draw it on the screen.
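The original extension is embedded as a snippet; here's a minimal CIImage-based version that serves the same purpose (the initializer name matches the assumption used in the request above):
import CoreImage
import UIKit

extension UIImage {
    // Converts the CVPixelBuffer returned by Vision into a UIImage we can display.
    convenience init?(pixelBuffer: CVPixelBuffer) {
        let ciImage = CIImage(cvPixelBuffer: pixelBuffer)
        let context = CIContext()
        guard let cgImage = context.createCGImage(ciImage, from: ciImage.extent) else { return nil }
        self.init(cgImage: cgImage)
    }
}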
And voila! We've created our own real-time style transfer iOS application.
Here are the results when the application was running on the iPhone SE:



Notice how, when running on the Neural Engine, the style transfer predictions are close to real time.
Due to GIF size and quality constraints, I've also made a video showing the real-time style transfer demo in action on the CPU, GPU, and Neural Engine. It's a lot easier on the eyes than the GIFs above.
You can find the full source code of the above application, along with the Core ML style transfer models, in this GitHub repository.
Core ML introduced model encryption in iOS 14, so in theory I could protect these models. But in the spirit of learning, I've chosen to offer the above models for free.
The future of machine learning is clearly no-code, with platforms like MakeML and Apple's Create ML leading the way in providing easy-to-use tools for quickly training mobile-ready machine learning models.
Create ML also introduced model training support for human activity classification this year, but I believe style transfer is the one that will be adopted by iOS developers the quickest. If you want to create a single model that incorporates multiple style images, use Turi Create instead.
Now you can build amazing AI-based, Prisma-like applications with neural style transfer networks at absolutely zero cost (unless you decide to upload your app to the App Store!).
Style transfer can also be used on ARKit objects to give them a completely different look. We'll cover that in the next tutorial. Stay tuned.
That's it for this one. Thanks for reading.