Pricing and execution insight from structuring diverse quote data

In March, Eugene Grinberg, New York-based co-founder and chief executive of SOLVE, visited Australia around his firm’s sponsorship of the KangaNews Debt Capital Market Summit. Grinberg talked to KangaNews about how SOLVE’s offering is helping investors and dealers make sense of the deluge of pricing signals they send and receive, and how it promotes price transparency even in some of global fixed income’s most illiquid and opaque asset classes.

The SOLVE offering is not trying to change how market participants trade – it is not a trading platform – but aims to maximise the value of trade-related data, like the abundance of price quotes, which typically are not used systematically. What is the value proposition for a buy- or sell-side user?

Ultimately, at the highest level, the value proposition is the same for both sides: it is about adding pre-trade price transparency to illiquid markets, allowing users to save time researching price points and reducing the risk of information asymmetry. These value propositions are universal.

But the nuances are slightly different, depending on the user. This is because there is an information asymmetry: the buy side is slammed with messages from dealers, which is often too much data to manage in any meaningful way. The sell side, meanwhile, is expected to provide market colour to the buy side – but it is often blind to what the rest of the street is doing.

For the buy side, part of day-to-day life is receiving lots of messages from trading counterparties – often many thousands of them, and some of our clients are receiving hundreds of thousands each day. This is an overwhelming amount of information that clearly they are not able to consume in any meaningful fashion, because it is all unstructured.

A key element of our offering for the buy side, therefore, is the message parsing technology we have developed. This is all built in-house, and is driven by machine learning and natural language processing. It reads these many thousands or hundreds of thousands of messages a day and extracts meaningful data points.

These generally fall into two categories. One is all the quotes data – both real-time and historical time series. These could be one- or two-sided markets or any other indications of price chatter or ‘market colour’.

The second category is indications of bonds that are actionable in the market – that are actually available to trade, as indicated by things like dealers making inventory public or marketing bonds on behalf of clients. This is very meaningful, because many bonds don’t trade for days, weeks or months at a time.
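As a rough illustration of what turning an unstructured message into structured quote data can look like, here is a minimal sketch. It uses a simple regular expression as a stand-in: the interview describes SOLVE's parser as driven by machine learning and natural language processing, and this is not that method. The message shape, the `Quote` fields and the identifier in the usage note are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed (hypothetical) message shape: "<ISIN> <bid>/<offer> [<size>mm]".
# Either side of the market may be missing, giving a one-sided quote.
QUOTE_RE = re.compile(
    r"\b(?P<isin>[A-Z]{2}[A-Z0-9]{9}[0-9])\s+"
    r"(?P<bid>\d+(?:\.\d+)?)?\s*/(?P<offer>\d+(?:\.\d+)?)?"
    r"(?:\s+(?P<size>\d+(?:\.\d+)?)\s*[mM]{2})?"
)


@dataclass
class Quote:
    isin: str
    bid: Optional[float]
    offer: Optional[float]
    size_mm: Optional[float]  # size in millions, if the message states one


def parse_quote(message: str) -> Optional[Quote]:
    """Return the first quote found in a free-text message, or None."""
    m = QUOTE_RE.search(message)
    if m is None:
        return None
    bid, offer, size = m.group("bid"), m.group("offer"), m.group("size")
    if bid is None and offer is None:
        return None  # an identifier with no price on either side is not a quote
    return Quote(
        isin=m.group("isin"),
        bid=float(bid) if bid else None,
        offer=float(offer) if offer else None,
        size_mm=float(size) if size else None,
    )
```

Under these assumptions, `parse_quote("AU3CB0123456 99.25/99.75 5mm")` yields a two-sided quote with a size attached, while `parse_quote("AU3CB0123456 99.50/")` yields a bid-only quote. Real dealer messages vary far more than a single pattern can capture, which is presumably why a learned parser is used in practice.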

We offer a number of workflow tools, which will alert the user to things that are happening. They may have identified a bond as interesting – or that they have interest at a certain price or yield – six weeks ago, and the system will let them know it is available in the market today.

Moving to the sell side, as I mentioned the risk is that the pricing information dealers put out could be off-market – because they don’t see nearly as much information as their buy-side counterparties. The second element of our product is the crowdsourced dataset, which is based on anonymous contributions from buy- and sell-side participants.

This is helpful for the buy side as many participants want to have access to an overlay of quotes that they see from their counterparties in addition to the anonymous data they are not seeing directly. But it is especially helpful for the sell side, because they can now have much more confidence in their price formation in addition to the significant time savings on tasks like researching quotes or asking other market participants for colour.

This obviously relies on users being willing to share their data, albeit on an anonymised basis. Philosophically, why would – or would not – a user want to share their quotes? An investor might feel they are being shown best prices, for instance – why would they want to incorporate these into a market average?

Users are not required to contribute their information. We have a good number of firms that want to be able to digitise their unstructured messages and keep the information to themselves. Many others believe transparency is good for markets and are happy to share.

When establishing a presence in a new market – like Australia – presumably it takes some time for the crowdsourcing aspect to build enough inputs to reach critical mass and thus to produce reliable price guidance for users. How do you manage – and encourage take-up – in the interim?

It is more about new asset classes than new geographies, because we already deal with a lot of global firms on the buy and sell sides – and once they contribute their data, our clients will benefit from global coverage.

We are active in eight asset classes in fixed income, though some of these can be divided into perhaps 20 sub-asset classes. The main categories are corporate bonds, syndicated bank loans, convertible bonds, credit default swaps, securitised products, municipal bonds, credit indices and private credit.

With each additional contributor, we get a sense of what marginal value they are bringing. We have found that the parser’s ability to extract an enormous amount of information from unstructured messages means we are able to get to critical mass of coverage in a new asset class fairly quickly.

Where are you starting in Australia, and do you foresee any specific challenges or opportunities locally?

We believe our offering is immediately applicable in corporate bonds – financials and true corporates – and Australian RMBS [residential mortgage-backed securities]. As I mentioned, a lot of our clients are large, global asset managers and therefore their data will span the global fixed-income market even if the relationship with us typically started in the US or Europe. This means we have had coverage of global asset classes for some time.

It is also important to note that, when firms choose the parsing solution, their coverage will be as good as the messages they are receiving. This said, we are certainly fairly new to the Australian market. We are excited to forge new relationships and continue learning about the market structure and best ways of servicing Australian clients. All the best product ideas come from client conversations.

You mentioned the idea that the SOLVE offering adds transparency in illiquid markets. We tend to assume that a lot of Australian credit product does not turn over as much as equivalent bonds in global core markets. Is this a hurdle?

A good amount of the markets we cover globally are fairly illiquid. Our first asset class was securitised products, and this was in 2011 – on the heels of the great financial crisis. In other words, we came out of an environment in which a lot of fixed-income markets were highly illiquid and our system was developed with this in mind.

When a user plugs in a CUSIP or an ISIN the bond may not have any quotes. Our system is designed to identify ‘tangential’ bonds. For instance, an RMBS bond will have multiple tranches, and if there are no quotes on, say, the double-A notes, our system makes it easy to see all the other notes in the same deal.

With corporates, the system does the same for other bonds from the same issuer, or from other issuers with the same credit rating and maturity band. It allows users to form a solid opinion of where a bond ‘should’ trade based on comparables.
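The lookup for ‘tangential’ bonds described above can be sketched roughly as follows. This is an illustration only: the `Bond` fields, the maturity bands and the order in which the fallbacks are tried are assumptions made for the example, not details given in the interview.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Bond:
    isin: str
    issuer: str
    deal: Optional[str]  # securitisation deal name; None for corporates
    rating: str
    maturity_years: float
    has_quotes: bool


def maturity_band(years: float) -> str:
    # Hypothetical bands, chosen only for this example.
    if years < 3:
        return "0-3y"
    if years < 7:
        return "3-7y"
    return "7y+"


def find_comparables(target: Bond, universe: List[Bond]) -> List[Bond]:
    """Return quoted bonds that can stand in for an unquoted target."""
    quoted = [b for b in universe if b.has_quotes and b.isin != target.isin]

    # 1. Other tranches in the same deal (e.g. other notes in an RMBS).
    if target.deal is not None:
        same_deal = [b for b in quoted if b.deal == target.deal]
        if same_deal:
            return same_deal

    # 2. Other bonds from the same issuer.
    same_issuer = [b for b in quoted if b.issuer == target.issuer]
    if same_issuer:
        return same_issuer

    # 3. Other issuers with the same rating and maturity band.
    band = maturity_band(target.maturity_years)
    return [
        b for b in quoted
        if b.rating == target.rating and maturity_band(b.maturity_years) == band
    ]
```

The tiered fallback is one design choice among several. A real system might instead show all three groups side by side and leave the judgement to the user.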

Does the system differentiate between executed and not executed prices, for instance by weighting? What is the thinking behind this approach?

Generally speaking, the parser tries to extract other meaningful metadata from messages that are related to quotes. It is not always clear if a published price is indicative or an executable level. But if somebody publishes a price with a size attached to it, this provides more confidence that it is likely to be an executable level. It’s the same with a published two-way market with a bid size and an offer size.

The other thing the parser will extract from messages is an axe flag. If the dealer is working on a trade on behalf of a client, clients will know they are not just ‘blowing smoke’ – there is a real interest the dealer is trying to find a match for.
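To make these signals concrete, here is a small sketch that lists which of them are present on a parsed quote. The `ParsedQuote` structure and the signal labels are hypothetical. The sketch deliberately returns labels rather than a numeric score, because the interview does not describe any weighting.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ParsedQuote:
    bid: Optional[float] = None
    offer: Optional[float] = None
    bid_size: Optional[float] = None
    offer_size: Optional[float] = None
    axe: bool = False  # dealer flagged a real client interest


def executability_signals(q: ParsedQuote) -> List[str]:
    """List the signals suggesting a quote is executable rather than indicative."""
    signals = []
    if q.bid_size is not None or q.offer_size is not None:
        signals.append("size attached")
    two_way = q.bid is not None and q.offer is not None
    if two_way and q.bid_size is not None and q.offer_size is not None:
        signals.append("two-way market with sizes")
    if q.axe:
        signals.append("axe flag")
    return signals
```

A bare price with no size and no axe flag returns an empty list, which corresponds to the case where it is unclear whether the level is indicative or executable.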

Ultimately, clients know who is just spraying out prices versus who is legit. By giving them the information in one framework – where they don’t have to search through messages to figure out who is saying what – it is easy for them to form a quick opinion of real market value and the likelihood of execution, based on who is putting out information as well as the various signals and metadata.

This is presumably where it is important that a user’s own data is not anonymised – it only becomes anonymised when and if it goes into the crowdsourced system.

This is exactly right. However, when the data goes into the crowdsourced dataset and gets anonymised, the associated metadata still carries over. Even for an anonymised quote, users will be able to see if a size or an axe flag was attached to the message.

The anonymous quotes are also tiered based on the provider’s size in a particular market, which provides additional transparency and confidence.

Going back to the sell side, do you believe your platform can help dealers pull together rate sheets in a more systematic way?

It gives dealers confidence. As I said, the buy side knows who is providing them with good information. Anecdotally, buy-side users of our system have on occasion specifically asked for the ability to exclude certain dealers just because they feel that the information that is being provided is meaningless. It is important for dealers to provide information that the buy side finds valuable and meaningful.

The more information the sell side is consuming, the better the outcome they will be able to feed back to their buy-side clients and the more confidence they can have that they are on top of the market.

The next step for SOLVE is rolling out its AI predictive pricing model. What additional benefits is this designed to add and what is the plan for rollout? When might it become relevant to Australian users?

The AI predictive price is a new product, designed to predict where the next trade is going to occur, specific to the side and size of the trade. We anticipate that it will work incredibly well side by side with our existing quotes product. What the AI is really good at is filling in the blanks around available data.

The quotes product takes granular, individual quotes we are observing in the market, digitises them and makes them available. This is extremely valuable, because it gives a sense of the depth of the market: how many dealers are quoting it and the range of bids and offers. This is very meaningful to a trader.

At the same time, we know we are dealing with illiquid markets and a large portion of the universe of securities does not get quoted. If a user needs to have an idea of where a specific bond should price, we have to put a lot of time into the price formation – trying to identify comps is a process, and it is time-consuming and risky.

A perfect formulation needs trades, quotes and reference data. Reference data helps establish comparables. Trades are super meaningful, because this is the goal the AI is training toward. Quotes are also incredibly meaningful, because many bonds don’t trade but they are still quoted – so quotes act as a kind of predictor of trades that are going to happen.

All three of these prerequisites exist in the first asset class in which we have rolled out the AI – municipal bonds – and the next one we are planning, corporate bonds. The AI is already doing a phenomenal job in the munis sector: our prediction error in the sector is in the low-to-mid 20s basis points in price, which is about mid-to-high single-digit basis points in yield. While this product is still in beta, we continue experimenting with different formulations of the AI model that continue to drive down the median error as well as the tail.
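For readers who want to see how a price error and a yield error can relate, the standard first-order duration approximation is one way to connect them: a relative price change is roughly modified duration times the yield change. The sketch below applies that approximation. The interview does not say how SOLVE converts between the two measures, so the function name, the reading of "basis points in price" as a percentage of price, and the duration used in the usage note are all assumptions.

```python
def yield_error_bp(price_error_bp: float, modified_duration: float) -> float:
    """Approximate yield error (bp) implied by a relative price error (bp).

    Uses the first-order relation |dP / P| ~= D_mod * |dy|, so
    |dy| ~= |dP / P| / D_mod. Convexity is ignored.
    """
    if modified_duration <= 0:
        raise ValueError("modified_duration must be positive")
    return price_error_bp / modified_duration
```

For example, under the assumed duration of four years, `yield_error_bp(24, 4)` gives 6.0, so a 24 basis point price error corresponds to roughly a 6 basis point yield error. Longer-duration bonds would show a smaller yield error for the same price error.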

It is pricing almost the entire muni space, liquid and illiquid – so about 900,000 securities. To put this in context, only about 150,000 of these have any recent trade or quote data points.

The goal is to roll this out for every one of our asset classes and every one of the geographies in which we operate. The challenge will be making sure we have all three of the datasets. They are not all available in every asset class and geography, though – without giving too much away – we are looking at some alternatives that we believe will allow us to do a really good job of training the AI even in markets where trade data is not publicly available.