Turnover, account value and diversification of real traders: evidence of collective portfolio optimizing behavior

Despite the availability of very detailed data on financial markets, agent-based modeling is hindered by the lack of information about real trader behavior. This makes it impossible to validate agent-based models, which are thus reverse-engineering attempts. This work is a contribution to the building of a set of stylized facts about the traders themselves. Using the client database of Swissquote Bank SA, the largest on-line Swiss broker, we find empirical relationships between turnover, account values and the number of assets in which a trader is invested. A theory based on simple mean-variance portfolio optimization that crucially includes variable transaction costs is able to reproduce faithfully the observed behaviors. We finally argue that our results bring to light the collective ability of a population to construct a mean-variance portfolio that takes into account the structure of transaction costs.


Introduction
The availability of large data sets on financial markets is one of the main reasons behind the number and variety of works devoted to their analysis in various fields, and especially so in Econophysics since physicists much prefer to deal with very large data sets. At the macroscopic level, the analysis of millions of tick-by-tick data points uncovered striking regularities of price, volume, volatility, and order book dynamics (see [28,7,19,5] for reviews). Since these phenomena are caused by the behavior of individual traders, news, and the interplay between the two, finding a microscopic mechanism that allows agent-based models to reproduce some of these stylized facts is an important endeavor meant to give us insight on the causes for large fluctuations, be it herding [18], competition for predictability [13], portfolio optimization leading to market instability [30], or chaotic transitions [9].
Market phenomenology appears as a typical example of collective phenomena to the eyes of statistical physicists. Thus, the temptation to regard the numerous power-laws found in empirical works as signatures of criticality is intense. But if the former are really due to a phase transition, one wishes at least to know what the phases are, which is hard to guess from the data alone. According to early herding theoretical models [18], the phase transition may lie in the density of social communication and imitation, and is of percolation type, thereby linking power-law distributed price and volume, criticality and agent-behavior. The standard Minority Game [15] has also a single phase transition point where market predictability is entirely removed by the agents, without any specular effect on price and volume; on the other hand, grand-canonical MGs [33,25,11,12] that allow the agents not to play have a semi-line of critical points that do produce stylized facts of price, volume and volatility dynamics; in the framework of statistical physics, the phase transition is due to symmetry breaking, i.e., it is a transition between predictable and perfectly efficient markets; this also suggests that the emergence of large fluctuations is due to market efficiency.
There are of course many other possible origins of power-laws in financial markets that have nothing to do with a second order phase transition. The simplest mechanism is to consider multiplicative random walks with a reflecting boundary [29]. Long-range memory of volatility is well-reproduced in agent-based models whose agents act or do nothing depending on a criterion based on a random walk [6]. Assuming pre-existing power-law distributed wealth, an effective theory of market phenomenology links the distributions of price returns, volume, and trader wealth [23]. On the other hand, markets are able to produce power-law distributed price returns by simple mechanisms of limit order placement and removal without the need for wealth inequality [14,22]. However, in turn, one needs to explain why limit orders are placed in such a manner; the heterogeneity of time scales may provide an explanation of order placement far away from best prices if power-law distributed [26], but additional work is needed in order to explain order placement near best prices, which causes these large price moves. Finally, a recent simple model of investment with leverage is able to reproduce some stylized facts [36].
But mechanisms alone may not be sufficient to replicate the full complexity of financial markets, as some part of it may lie instead in the heterogeneity of the agents themselves. While the need for heterogeneous agents in this context is intuitive (see e.g. [2]), there is no easily available data against which to test or to validate microscopically an agent-based model. Even if it is relatively easy to design agent-based models that reproduce some of the stylized facts of financial markets (see e.g. [27,10,9,13,1]), one never knows if this is achieved for good reasons, except for volatility clustering [6]: it is to be expected that real traders behave sometimes at odds with one's intuition. Thus, without data about the traders themselves, one is left with the often frustrating and time-consuming task of reverse-engineering the market in order to determine the good ingredients indirectly. Some progress has been made recently with the analysis of transactions in the Spanish stock market aggregated by brokers [37], hence with mesoscale resolution.
Data on trader behavior is found in the files of brokers, usually shrouded in secrecy. But this lack of data accessibility is not entirely to blame for the current ignorance of real-trader dynamics: researchers, even when given access to broker data, have focused on trading gains and behavioral biases, often with factor-based analyses (see e.g. [3,4,20]).
We aim at providing a coherent picture of how various types of traders behave and interact, making it possible for agent-based models to rest on a much more solid basis. This paper is the first of a series that will establish stylized facts about trader characteristics and behavior. One of the most important aspects of these papers will be to characterize the heterogeneity of the traders in all respects (account value, turnover, trading frequency, behavioral biases, etc.) and the relationships between these quantities in probability distribution, not with factors. This paper is first devoted to the description of the large data set that we use; it then focuses on the relationship between trader account value, turnover per transaction and transaction costs, both empirically and theoretically. We will show that while the traders have a spontaneous tendency to build equally-weighted portfolios, the number of stocks in a portfolio increases non-linearly with their account value, which we link to portfolio optimisation and broker transaction fee structure.

Description of the data
Our data are extracted from the database of the largest Swiss on-line broker, Swissquote Bank SA (further referred to as Swissquote). The sample contains comprehensive details about all the 19 million electronic orders sent by 120'000 professional and non-professional on-line traders from January 2003 to March 2009. Of these orders, 65% have been canceled or have expired and 30% have been filled; the remaining 5% were still valid as of the 31st of March 2009. Since this study focuses on turnover as a function of account value, we chose to exclude orders for products that allow traders to invest more than their account value (leveraging), i.e., orders sent to margin-call markets such as the foreign exchange market (FOREX) and the derivative exchange EUREX. The resulting sample contains 50% of orders for derivatives, 40% for stocks, and 4% for bonds and funds. Finally, 70% of these orders were sent to the Swiss market, 20% to the German market and about 10% to the US market.
Swissquote clients consist of three main groups: individuals, companies, and asset managers. Individual traders, also referred to as retail clients, are mainly non-professional traders acting for their own account. The accounts of companies are usually managed by individuals trading on behalf of a company and, as we shall see, behave very much like retail clients, albeit with a larger typical account value. Finally, asset managers manage accounts of individuals and/or companies, some of them dealing with more than a thousand clients; their behavior differs markedly from that of the other two categories of clients.

Account values
Numerous studies have been devoted to the analysis and modeling of wealth dynamics and distribution among a population (see [40] and references therein). The general picture is that in a population, a very large majority lies in the exponential part of the reciprocal cumulative distribution function, while the wealth of the richest people is Pareto-distributed, i.e., according to a power-law.
The account value of Swissquote traders is by definition the sum of all their assets (cash, stock, bonds, derivatives, funds, deposits), and is denoted by $P_v$. In order to simplify our analysis, we compute $P_v$ once per day after US markets close and take this value as a proxy for the next day's account value. Figure 1 displays its distribution computed at the time of the first and last transactions of the clients. Results are shown for the three main categories of clients. Maximum likelihood fits of the tail of the individual traders' distribution to the Pareto model $p(x) \sim (x/x_{\min})^{-\gamma}$ were performed using the BCa bootstrap method of [21], with the parameter $x_{\min}$ determined by minimizing the Kolmogorov-Smirnov statistic as in [16]. Results are reported in table 1.
The values of $\gamma$ are in line with the wealth distribution of all major capitalistic countries (see [34] for a possible origin of Pareto exponents between 2.3 and 2.5). Thus the retail clients [...] [31]. Consequently, figure 1 also reports a fit of the data to log-normal distributions $\ln\mathcal{N}(\mu, \sigma^2)$, which approximate $P_>(P_v)$ more faithfully than the Student and the Weibull distributions for the three categories of clients, except for the extreme tail in the case of retail clients.
Figure 1: Reciprocal cumulative distribution function of the portfolio value $P_v$ for the three categories of clients at the time of their first (empty symbols) and last (filled symbols) transactions. Several models have been fitted to the data by Maximum Likelihood Estimation (MLE): the Student distribution (Pareto with plateau), the Weibull (stretched exponential), and the log-normal distribution. The best candidate, determined graphically and via bootstrapping the Kolmogorov-Smirnov test [16], was found to be the log-normal distribution, which is the only one shown here for the sake of clarity. The dashed line in light blue results from an MLE fit to the tail of the individual traders with the Pareto distribution $p(x) \sim (x/x_{\min})^{-\gamma}$ (see section 3.1).
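The tail-fitting procedure described above (maximum likelihood for $\gamma$, with $x_{\min}$ chosen by minimizing the Kolmogorov-Smirnov statistic) can be sketched in Python as follows. This is only an illustrative sketch, not the code used for table 1: the function name `fit_pareto_tail`, the `min_tail` parameter and the synthetic sample are assumptions made for the example, and the BCa bootstrap confidence intervals are left out.

```python
import numpy as np

def fit_pareto_tail(values, min_tail=50):
    """Return (x_min, gamma, ks) for a tail density p(x) ~ (x / x_min)^(-gamma)."""
    x = np.sort(np.asarray(values, dtype=float))
    x = x[x > 0]
    best = (None, None, np.inf)
    # Candidate thresholds: every distinct value that leaves more than min_tail points.
    for x_min in np.unique(x[:-min_tail]):
        tail = x[x >= x_min]
        n = tail.size
        log_sum = np.sum(np.log(tail / x_min))
        if log_sum <= 0:
            continue
        gamma = 1.0 + n / log_sum  # maximum likelihood estimate of the density exponent
        # Kolmogorov-Smirnov distance between empirical and fitted tail CDFs.
        model_cdf = 1.0 - (tail / x_min) ** (1.0 - gamma)
        ks = max(np.max(np.abs(np.arange(1, n + 1) / n - model_cdf)),
                 np.max(np.abs(np.arange(0, n) / n - model_cdf)))
        if ks < best[2]:
            best = (x_min, gamma, ks)
    return best

# Synthetic account values with a pure Pareto tail (hypothetical parameters).
rng = np.random.default_rng(0)
gamma_true, x_min_true = 2.4, 1000.0
sample = x_min_true * (1.0 - rng.random(2000)) ** (-1.0 / (gamma_true - 1.0))
x_min_hat, gamma_hat, ks_hat = fit_pareto_tail(sample)
print(x_min_hat, gamma_hat, ks_hat)
```

On real account values, the body of the distribution is not Pareto, so the selected $x_{\min}$ marks where the power-law tail starts; here the sample is Pareto everywhere and the selected threshold carries no particular meaning.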

Mean turnover
The turnover of a single transaction $i$, denoted by $T_i$, is defined as the price paid times the volume of the transaction and does not include transaction fees. We have excluded the traders that have leveraged positions on stocks, hence $T_i \le P_v$; more generally one wishes to determine how the average turnover of a given trader relates to his portfolio value. In passing, since $P(P_v)$ has fat tails, the only way the distribution of $T$ can avoid having fat tails is if the typical turnover is proportional to $\log(P_v)$. We denote by $T$ the mean turnover per transaction for a given client over the history of his activities. Figure 2 reports its reciprocal cumulative distribution functions (RCDF) for stocks and derivatives for the three categories of clients; all RCDFs have a first plateau and then a fat tail. For stocks, the tails are not pure power laws, but they are for derivatives. Indeed, fitting the RCDFs for stocks with Weibull, log-normal, and Zipf-Mandelbrot distributions with an exponential cut-off, the last being of the form
$$F_>(x) \propto \frac{e^{-\beta x}}{(c+x)^{\gamma}}, \qquad (1)$$
where $c$ denotes the offset of the Zipf-Mandelbrot law, clearly shows that the latter is the only one that does not systematically underestimate the tail of the RCDF; estimated values of $\beta$ and $\gamma$ are given in table 3. The RCDFs related to the turnover of transactions on derivative products have clearer power-law tails for retail clients, which we fitted with a standard Zipf-Mandelbrot function, of the form
$$F_>(x) \propto \frac{1}{(c+x)^{\gamma}}. \qquad (2)$$
The estimated parameters are to be found in table 3; because of the power-law nature of this tail, fits with Weibull and log-normal distributions are not very good in the tails. While the decision process that allocates a budget to each type of product may be essentially the same, the buying power is larger for derivative products, which may explain the absence of a cut-off.
Fitting the distributions for companies and asset managers is very difficult and mostly inconclusive because of insufficient sample size; the good quality of the tail collapse (see inset) tends to indicate that the three distributions are identical, but we could not fit the RCDF of companies and asset managers with (2); as reported in figure 2b, log-normal distributions are adequate choices in these cases; since the quality of the fits is poor, we do not report the resulting parameters.

Mean turnover vs account value
The relationship between $T$ and $P_v$ is important as it dictates what fraction of their investable wealth the traders exchange in markets. We first produce a scatter plot of $\log T$ vs $\log P_v$ (figure 4), which shows a cloud of points that is roughly increasing. A density plot is however clearer for retail clients as there are many more points (figure 3).
These plots make it clear that there are simple relationships between $\log T$ and $\log P_v$. A robust non-parametric regression method [17] reveals a double linear relationship between $\log T$ and $\log P_v$ for all three categories of investors (see figures 4 and 3):
$$\log T = \beta_x \log P_v + a_x, \qquad (3)$$
where $x = 1$ when $\log P_v < \Theta_1$ and $x = 2$ when $\log P_v > \Theta_2$. Fitted values with confidence intervals are reported in table 5.
This result is remarkable in two respects: (i) the double linear relation, not obvious to the naked eye, separates investors into two groups; (ii) the range of values where the transition occurs is very similar across the three categories of traders.
The relationships above only apply to averages over all the agents. This means that there are some intrinsic quantities that make all the agents deviate from this average line. Detailed examination of the regression residuals shows that the latter are for the most part (i.e. more [...]) normally distributed, so that
$$\log T^i = \beta_x \log P_v^i + a_x + \delta a_x^i, \qquad (4)$$
where $T^i$ and $P_v^i$ are respectively the turnover and portfolio value of investor $i$, and the $\delta a_x^i$ are i.i.d. $\mathcal{N}(0, \xi_x^2)$ idiosyncratic variations independent from $P_v$ that mirror the heterogeneity of the agents. As we shall see, portfolio optimization with heterogeneous parameters yields this precise relationship.
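To make the double linear relationship with idiosyncratic noise concrete, the following Python sketch generates synthetic investors from two regimes and recovers the slope, intercept and residual spread of each regime. It is a simplified stand-in: ordinary least squares replaces the robust non-parametric regression of [17], and the thresholds, slopes, intercepts and noise levels are hypothetical values chosen for the example, not the fitted values of table 5.

```python
import numpy as np

def fit_regime(log_pv, log_t):
    """Least-squares fit of log T = beta * log P_v + a; returns (beta, a, xi)."""
    beta, a = np.polyfit(log_pv, log_t, 1)
    residuals = log_t - (beta * log_pv + a)
    return beta, a, residuals.std(ddof=2)

# Hypothetical parameters: regime -> (beta_x, a_x, xi_x), thresholds theta_1 < theta_2.
rng = np.random.default_rng(1)
theta_1, theta_2 = 9.5, 10.5
params = {1: (0.85, 0.2, 0.6), 2: (0.5, 3.7, 0.8)}

# Synthetic population: log-normal account values, noisy log-turnover in each regime.
log_pv = rng.normal(10.0, 1.5, size=20000)
in_regime_1 = log_pv < 0.5 * (theta_1 + theta_2)
beta = np.where(in_regime_1, params[1][0], params[2][0])
a = np.where(in_regime_1, params[1][1], params[2][1])
xi = np.where(in_regime_1, params[1][2], params[2][2])
log_t = beta * log_pv + a + xi * rng.standard_normal(log_pv.size)

# Fit each regime separately, leaving out the transition zone between the thresholds.
low, high = log_pv < theta_1, log_pv > theta_2
fit_1 = fit_regime(log_pv[low], log_t[low])
fit_2 = fit_regime(log_pv[high], log_t[high])
print(fit_1, fit_2)
```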

Turnover rescaled by account value
Let us now measure the typical fraction of wealth exchanged in a single transaction, defined as $Q = T/P_v$. Since the inverse of this ratio is an indirect (and imperfect) proxy of the number $N$ of assets that a trader owns, it also indicates how well diversified his investments are; hence, it can be viewed as a simple proxy of the risk profiles of the agents. Figure 5 shows that the distributions look exponential to the naked eye for about 90% of the individuals and nearly 80% of the companies, while that of the asset managers is clearly more complex than a simple exponential. We derive exact relationships for this quantity in subsection 3.4.2 that show that these distributions are in fact not exponential but log-normal.

Data
The resulting picture is that only a small fraction of customers trade a large fraction of their wealth on average. Interestingly, these figures show a clear difference between the three categories of clients. As discussed above, figure 5 roughly reflects the risk profile of the different types of customers: less than 10% of asset managers trade on average more than 20% of their clients' capital in a single transaction; this rises to 30% for companies, and 45% for retail clients. Note however that despite the fact that the account values of companies and asset managers are comparable, companies tend to have a Q closer to that of the individuals; this suggests either that companies hold a smaller N than asset managers for the same account value, or that asset managers tend to make smaller adjustments to the quantities of assets.

Theory
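Since we know the distributions of $T$ and $P_v$ and their relationship, we are in a position to derive analytical expressions for $Q_i = T(t)/P_v(t)$ of investor $i$. The distribution of $Q$ across the population of on-line investors can be easily found using (4) and the distribution of $P_v$. Let $P_{T,P_v}(t, p_v)$ denote the joint distribution of $T$ and $P_v$:
$$P_Q(q) = \int_0^\infty \mathrm{d}p_v\, p_v\, P_{T,P_v}(q p_v, p_v) = \int_0^\infty \mathrm{d}p_v\, p_v\, P_{T|P_v}(q p_v | p_v)\, P_{P_v}(p_v). \qquad (5)$$
Let us now assume for the sake of clarity that $T = e^{a+\delta a} P_v^{\beta}$. Given $P_v = p_v$, the turnover $T$ follows a log-normal distribution with mean $\beta \log p_v + a$ and variance $\xi^2$. Substituting $P_{T|P_v}(t|p_v) = \ln\mathcal{N}(\beta \log p_v + a, \xi^2)$ in (5) leads after some simplifications to
$$P_Q(q) = \frac{1}{q\,\xi\sqrt{2\pi}} \int_0^\infty \mathrm{d}p_v\, P_{P_v}(p_v)\, \exp\left(-\frac{[\log q + (1-\beta)\log p_v - a]^2}{2\xi^2}\right) \qquad (6)$$
and, for the reciprocal cumulative distribution function $F_Q(q) = P(Q > q)$,
$$F_Q(q) = \frac{1}{2} \int_0^\infty \mathrm{d}p_v\, P_{P_v}(p_v)\, \mathrm{erfc}\left(\frac{\log q + (1-\beta)\log p_v - a}{\sqrt{2}\,\xi}\right), \qquad (7)$$
where $\mathrm{erfc}(x) = \frac{2}{\sqrt{\pi}} \int_x^\infty e^{-y^2}\,\mathrm{d}y$ is the complementary error function. As expected, when $\beta = 0$ (i.e. $T$ and $P_v$ are independent), we recover the product of the two marginal distributions. On the other hand, when $\beta = 1$, i.e., when $T$ is proportional to $P_v$, $P_Q(q) = \ln\mathcal{N}(a, \xi^2)$, which is the distribution of the factor $e^{a+\delta a}$. For other values of $\beta$ the functions $P_Q$ and $F_Q$ cannot be determined analytically unless $P_{P_v}$ takes a particular form, as shown below. However, the moments of $P_Q(q)$ can be arranged in a simpler form:
$$E(q^n) = e^{na + \frac{1}{2} n^2 \xi^2} \int_0^\infty \mathrm{d}p_v\, P_{P_v}(p_v)\, p_v^{-n(1-\beta)}, \qquad (8)$$
that is, the (log-normal) moments of $T/P_v$ times an integral term smaller than or equal to 1 (because in practice $P_v > 1$) (footnote 1). Hence, the relation $E(q^n) \le e^{na + \frac{1}{2} n^2 \xi^2}$, with equality when $\beta = 1$, holds for any distribution of the account value $P_v$.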
In section 3.1, we have shown that the distribution of $P_v$ is well-approximated by a log-normal distribution. This particular choice of distribution makes the previous integrals analytically tractable. Indeed, with $P_{P_v} = \ln\mathcal{N}(\mu, \sigma^2)$, straight integration of (6) leads to $P_Q = \ln\mathcal{N}(M, S^2)$, where $M = a - (1-\beta)\mu$ and $S^2 = \xi^2 + (1-\beta)^2\sigma^2$. This simple result has some practical interest: given the distribution parameters and the coupling factor $\beta$, one can draw realistic $q$ factors for agent-based modeling as $Q = e^{M+SX}$, where $X$ is $\mathcal{N}(0,1)$ distributed. Furthermore, in the next section, we show how the value of $\beta$ may be inferred from the transaction cost structure, which decreases the number of parameters to four. Figure 5 confirms the validity of the above theoretical results, once expanded to the case of a bi-linear relation between $T$ and $P_v$. It is noteworthy that the continuous lines are not fits to the empirical $q$ factors, but use instead the results of the separate fits on the turnover and account distributions.
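The recipe for drawing $q$ factors can be illustrated with a short Python sketch that samples $Q = e^{M+SX}$ and compares it with a direct simulation of $T/P_v$ under the single-regime assumption $T = e^{a+\delta a} P_v^{\beta}$. The function names and all numerical parameter values below are hypothetical and only serve the example; they are not the fitted values reported in the tables.

```python
import numpy as np

def q_lognormal_params(a, xi, beta, mu, sigma):
    """Log-normal parameters (M, S) of Q = T / P_v for log-normal P_v."""
    M = a - (1.0 - beta) * mu
    S = np.sqrt(xi**2 + (1.0 - beta) ** 2 * sigma**2)
    return M, S

def draw_q(n, a, xi, beta, mu, sigma, rng):
    """Draw n values of Q = exp(M + S X), with X standard normal."""
    M, S = q_lognormal_params(a, xi, beta, mu, sigma)
    return np.exp(M + S * rng.standard_normal(n))

# Hypothetical parameters (single regime).
a, xi, beta, mu, sigma = 2.0, 0.6, 0.7, 10.0, 1.5
rng = np.random.default_rng(2)
n = 200_000

M, S = q_lognormal_params(a, xi, beta, mu, sigma)
q_model = draw_q(n, a, xi, beta, mu, sigma, rng)

# Direct simulation: draw P_v, then T given P_v, then take the ratio.
log_pv = rng.normal(mu, sigma, size=n)
log_t = a + beta * log_pv + xi * rng.standard_normal(n)
q_direct = np.exp(log_t - log_pv)

print(M, S, np.log(q_direct).mean(), np.log(q_direct).std())
```

With these values, $M = -1$ and $S = 0.75$, and the mean and standard deviation of $\log Q$ from the direct simulation should agree with them up to sampling noise.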

The influence of transaction costs on trading behavior: optimal mean-variance portfolios
Apart from risk profiles, education, and typical wealth, the differences in the turnover as a function of wealth observed above between the three populations of traders may also lie in the difference of their actual transaction cost structure. Swissquote's current standard fee structure for the Swiss market (its shape is very similar for European and US markets) is shown in figure 6; it is a piece-wise constant, non-linear looking function. Fitting all segments to equation (10) gives $\delta = 0.63$, with 95% confidence interval $[0.5, 0.74]$. The fee structure of most brokers is not set in stone and can be negotiated. A frequent request is to have a flat fee, i.e. a fixed cost per transaction corresponding to a constant function. Since quite clearly the negotiation power of large clients or of clients that carry out many transactions is more important, asset managers are more likely to obtain a more favorable fee structure than basic retail clients.
Footnote 1: Mathematically, all the moments of $Q$ always exist since $\beta \le 1$ and $P_{P_v}(p_v)$ must decay faster than $p_v^{-1}$ to be a valid distribution.
Since buying some shares of an asset is the result of an unconscious or calculated portfolio construction process, one first needs a theoretical reference point with which to compare the population characteristics as measured in the previous subsection. In other words, we shall use results from portfolio optimization theory with non-linear transaction cost functions to understand the results of the previous subsection.
Quite curiously, all analytical papers in the literature on optimal portfolios either neglect transaction costs or assume constant or linear transaction cost structures; non-linear structures are tackled numerically. We therefore incorporate the specific non-linear transaction cost structure faced by the traders under investigation in the classic one-shot portfolio optimization problem studied by Brennan [8], who restricted his discussion to fees proportional to the number of securities, in other words, a flat fee per transaction.
Building optimal mean-variance stock portfolios consists for a given agent in selecting which stocks to invest in and in what proportion by maximizing the expected portfolio growth, usually called return, while trying to reduce the resulting a priori risk. One cost function that corresponds to such requirements is
$$L_\lambda(R) = E(R) - \lambda\,\mathrm{Var}(R), \qquad (9)$$
where $R$ is the stochastic return of the portfolio over the investment horizon (e.g., one month, one year) and $\lambda$ tunes the trade-off between risk and return; as such, it can be interpreted as a measure of an investor's attitude towards risk: the larger $\lambda$, the more risk-averse the investor.
The return of the portfolio can be decomposed into contributions from risky assets (stocks, derivatives, etc.), the interest on the amount kept in cash, and the total relative cost of broker commission, which we denote as $R = R_{risky} + R_{cash} - R_{cost}$. Mathematically,
$$R_{risky} = \sum_{i=1}^{N} x_i R_i,$$
where $R_i$ is the return of stock $i$ over this horizon, $x_i$ is the fraction of the total wealth invested in this stock, and $N$ is the total number of investable assets; we shall denote the total fraction of wealth invested in risky assets by $x = \sum_{i=1}^{N} x_i$. The cash contribution is
$$R_{cash} = (1 - x)\,r,$$
where $r$ is the interest rate; finally,
$$R_{cost} = \frac{1+r}{P_v} \sum_{i=1}^{N} F(x_i P_v),$$
where $F(x)$ is the amount charged by a broker to exchange an amount $x$ of cash into shares or vice-versa.
The focus of this section is to derive explicit relationships between $F$, the number of assets to hold in a portfolio, and the account value $P_v$. Since previous works only considered special cases for $F$ that are not compatible with the fee structure of Swissquote, we need to introduce a cost function that can accommodate all the standard broker commission schemes. The two extreme cases are (i) a flat fee per transaction, i.e., a fixed cost that does not depend on the amount exchanged, and (ii) a proportional scheme, possibly with a maximum fee. Swissquote's standard scheme stands in between and is well approximated by a power-law with a maximum fee $F_{\max}$. We hence choose
$$F(x) = \min\left(C x^{\delta}, F_{\max}\right), \qquad (10)$$
where $\delta$ interpolates between a flat fee ($\delta = 0$), as in [8], and a proportional scheme ($\delta = 1$) via a power-law, and $C$ is a constant.
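As an illustration of how an exponent $\delta$ can be extracted from a piece-wise constant fee schedule, the Python sketch below fits a power-law to the fee charged on each segment by least squares in log-log coordinates. The schedule used here is entirely hypothetical (it is generated from a known power-law and rounded to whole currency units) and is not Swissquote's actual fee table; the function names and the choice of representing each segment by one amount are assumptions of the example.

```python
import numpy as np

def power_law_fee(amount, C, delta, f_max):
    """Power-law commission C * amount^delta, capped at the maximum fee f_max."""
    return np.minimum(C * np.power(amount, delta), f_max)

def fit_fee_exponent(amounts, fees):
    """Least-squares fit of log F = log C + delta * log x; returns (delta, C)."""
    delta, log_c = np.polyfit(np.log(amounts), np.log(fees), 1)
    return delta, np.exp(log_c)

# Hypothetical piece-wise constant schedule: one representative amount per
# segment, with the fee rounded to a whole currency unit.
segment_amounts = np.geomspace(500.0, 50_000.0, 7)
segment_fees = np.round(power_law_fee(segment_amounts, C=0.2, delta=0.63, f_max=np.inf))

delta_hat, c_hat = fit_fee_exponent(segment_amounts, segment_fees)
print(delta_hat, c_hat)
```

The two limiting cases of the text correspond to `delta=0` (the fee equals `C` whatever the amount) and `delta=1` (the fee is proportional to the amount until it reaches `f_max`).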
Following the well-known one-factor model of Sharpe [32], we assume that the return of asset $i$ follows the global market's return $R_M$ with an idiosyncratic proportionality factor $\beta_i$. More specifically,
$$R_i - r = \beta_i (R_M - r) + \varepsilon_i, \qquad (11)$$
where $\varepsilon_i$ is an uncorrelated white noise, $E(\varepsilon_i) = E(\varepsilon_i \varepsilon_j) = E(R_M \varepsilon_i) = 0$ for $i \ne j$. This equation means that the systematic idiosyncratic part of $R_i$ only applies to the return above the risk-free interest rate, also called market risk premium.
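This completely specifies the functional $L_\lambda$. Returning to (9), one first computes the expectation and variance of the portfolio return:
$$E(R) = r + [E(R_M) - r] \sum_{i=1}^{N} x_i \beta_i - R_{cost} \qquad (12)$$
and
$$\mathrm{Var}(R) = \mathrm{Var}(R_M) \left(\sum_{i=1}^{N} x_i \beta_i\right)^2 + \sum_{i=1}^{N} x_i^2\, \mathrm{Var}(\varepsilon_i). \qquad (13)$$
Note that, since here the risk-free rate is non-random, the portfolio variance is independent of both the risk-free investment and broker commission; this does not hold for the expected return.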
In principle, the functional $L_\lambda$ depends on $N$, the number of assets in the portfolio, $\lambda$, the risk parameter, and $x_i$, the fraction of account value to invest in risky product $i$. Assuming that $x_i$ is the same for all $i$ (i.e. equally-weighted allocation), we are left with only three parameters, $N$, $\lambda$ and $x$, since $x_i = x/N$. Thus, from the optimization of the resulting functional one can obtain a relationship between any two of these parameters. We are mostly interested in $N$ as a function of $x$.
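The expectation and variance of the portfolio return under the one-factor model can be written down directly in Python, which also makes the equally-weighted simplification easy to check. This is a sketch under stated assumptions: the function names, the convention of passing the relative commission as a single number `rel_cost`, and all numerical inputs are hypothetical choices made for the example.

```python
import numpy as np

def portfolio_moments(x, betas, var_eps, r, mean_rm, var_rm, rel_cost=0.0):
    """Mean and variance of the portfolio return under the one-factor model.

    x        : fractions of wealth invested in each risky asset
    betas    : proportionality factors beta_i
    var_eps  : idiosyncratic variances Var(eps_i)
    rel_cost : total relative broker commission (deterministic, so it only shifts the mean)
    """
    x, betas, var_eps = (np.asarray(v, dtype=float) for v in (x, betas, var_eps))
    exposure = np.sum(x * betas)  # sum_i x_i beta_i
    mean = r + (mean_rm - r) * exposure - rel_cost
    var = var_rm * exposure**2 + np.sum(x**2 * var_eps)
    return mean, var

def objective(mean, var, lam):
    """Mean-variance functional: expected return minus lam times variance."""
    return mean - lam * var

# Hypothetical inputs: 10 assets, 80% of the account invested in equal weights.
rng = np.random.default_rng(3)
n_assets, x_total = 10, 0.8
betas = rng.normal(1.0, 0.2, n_assets)
var_eps = rng.uniform(0.05, 0.15, n_assets)
r, mean_rm, var_rm = 0.02, 0.08, 0.04

weights = np.full(n_assets, x_total / n_assets)
mean, var = portfolio_moments(weights, betas, var_eps, r, mean_rm, var_rm, rel_cost=0.001)

# Equally-weighted shortcut: only the mean beta and mean idiosyncratic variance matter.
var_shortcut = x_total**2 * (betas.mean() ** 2 * var_rm + var_eps.mean() / n_assets)
print(mean, var, var_shortcut, objective(mean, var, lam=2.0))
```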

Non-linear relationship between account value and number of assets
We will first assume that agents seek the optimal fraction of their account value $x^*$ to invest in $N$ securities, $N$ being known, given the risk-free rate $r$ and broker commission $F(x_i P_v)$. The optimal solution is simply obtained by setting $x_i = x/N$ in (12) and (13), and by equating to zero the derivative of (9) with respect to $x$. This leads to the following transcendental equation for $x^*$:
$$\bar{\beta}\,[E(R_M) - r] - (1+r)\,C\,\delta \left(\frac{N}{x^* P_v}\right)^{1-\delta} = 2 \lambda x^* \left[\bar{\beta}^2\, \mathrm{Var}(R_M) + \frac{\overline{\mathrm{Var}(\varepsilon)}}{N}\right], \qquad (14)$$
where $\bar{\beta} = \frac{1}{N}\sum_i \beta_i$ is the mean proportionality factor and $\overline{\mathrm{Var}(\varepsilon)} = \frac{1}{N}\sum_i \mathrm{Var}(\varepsilon_i)$ is the mean idiosyncratic volatility. Provided the investor risk tolerance $\lambda$ has been reliably estimated, which is usually a complex task [39], and that the Sharpe model is adequate, (14) can be used directly in a real-world portfolio optimization problem. The $\beta_i$ and $\varepsilon_i$ are then obtained by regressing the returns of all the stocks with (11); the optimal solution is expected to be reliable in the absence of significant residual correlations between $\varepsilon_i$ and $\varepsilon_j$. In the more common situation where $\lambda$ is unknown, one can derive a second equation for the optimal number of securities under the assumption that portfolios are sufficiently homogeneous, or that the investment horizon is long enough so as to have $\bar{\beta}$ and $\overline{\mathrm{Var}(\varepsilon)}$ independent from $N$. As shown in figure 7, $\bar{\beta}$ on the US stock market is persistently close to one for various time horizons and values of $N$, consistently with the homogeneous assumption. Taking a few technical precautions into account [8], the differentiation of the Lagrangian (9) with respect to $N$ leads to
$$\lambda = \frac{(1+r)\,C\,(1-\delta)}{\overline{\mathrm{Var}(\varepsilon)}}\, \frac{N^{2-\delta}}{x^{2-\delta}\, P_v^{1-\delta}}, \qquad (15)$$
where it is assumed that $\delta < 1$ since for $\delta = 1$ the optimum investment does not depend on $N$ through the cost function. According to (15), the agent risk tolerance increases with their account value $P_v$, in agreement with various survey studies on the risk tolerance of actual investors (see the literature review of [38]). Using (14) and (15) to get rid of $\lambda$, we obtain
$$\bar{\beta}\,[E(R_M) - r] = (1+r)\,C \left(\frac{N}{x P_v}\right)^{1-\delta} \left[\delta + 2(1-\delta)\left(1 + \frac{N}{K}\right)\right], \qquad (16)$$
where $K$ is the ratio of residual risk to market risk defined as
$$K = \frac{\overline{\mathrm{Var}(\varepsilon)}}{\bar{\beta}^2\, \mathrm{Var}(R_M)}. \qquad (17)$$
Given the desired level of systematic risk $x$, (16) can be solved for $N$ numerically in an actual portfolio optimization. Further insight is gained by considering the high diversification limit $N \gg 1$, which yields $\delta + 2(1-\delta)(1 + N/K) \simeq 2(1-\delta)N/K$ in (16) and thus
$$N \simeq \left[\frac{K\,\bar{\beta}\,[E(R_M) - r]}{2(1+r)\,C\,(1-\delta)}\right]^{\frac{1}{2-\delta}} (x P_v)^{\frac{1-\delta}{2-\delta}}, \qquad (18)$$
where $K$ is given by the right-hand side of (17). The latter equation generalizes [8] to the case of a varying cost impact represented here by the parameter $\delta$ (i.e. the result of [8] is recovered by setting $\delta = 0$ and $\beta_i = 1$ in (18)). These results can be further generalized to non-equally weighted portfolios by differentiating (9) with respect to $x_i$ and assuming again a homogeneous condition for the $\beta_i$s.
In essence, (18) says that the number of securities held in an equally-weighted mean-variance portfolio with Sharpe-like returns is related to the amount invested as
$$\log N = \frac{1-\delta}{2-\delta} \log(x P_v) + \kappa \qquad (19)$$
in the high diversification limit, where $\kappa$ is the logarithm of the pre-factor of $(x P_v)^{\frac{1-\delta}{2-\delta}}$ in (18). The last equation gives $N$ as a function of $P_v$ for a predefined $x$ in the optimal portfolio. The heterogeneity of the traders, beyond their account value, is not apparent yet, but may occur both in $x$ and $\kappa$: first, each trader may have his own preference regarding the fraction of his account to invest in risky assets, $x$; therefore one should replace $x$ by $x_i$; next, $\kappa$ includes both a term related to transaction costs, which does vary from trader to trader, and some measures and expectations of market returns and variance; each trader may have his own perception or way of measuring them, hence $\kappa$ should also be replaced by $\kappa_i$. Finally, both terms can be merged in the same constant term $\zeta_i = \frac{1-\delta}{2-\delta} \log(x_i) + \kappa_i$. This explains how the heterogeneity of the traders is the cause of fluctuations in the kind of relationships we are interested in.
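The scaling of the number of assets with account value can be checked numerically. The Python sketch below assumes that the optimality condition takes the form $\bar{\beta}[E(R_M) - r] = (1+r)\,C\,(N/(xP_v))^{1-\delta}\,[\delta + 2(1-\delta)(1 + N/K)]$, i.e. a pure power-law fee $C x^{\delta}$ without cap and an equally-weighted portfolio; it solves this condition for $N$ by bisection and compares the result with the high-diversification power law of exponent $(1-\delta)/(2-\delta)$. The function names and all numerical values are hypothetical.

```python
import numpy as np

def optimal_n(pv, x=1.0, r=0.02, mean_rm=0.08, beta_bar=1.0, K=2.25, C=0.2, delta=0.63):
    """Solve the assumed optimality condition for N by bisection on log N."""
    target = beta_bar * (mean_rm - r)

    def g(n):  # increasing in n when delta < 1
        return ((1 + r) * C * (n / (x * pv)) ** (1 - delta)
                * (delta + 2 * (1 - delta) * (1 + n / K)) - target)

    lo, hi = 1e-9, 1e12
    for _ in range(200):
        mid = np.sqrt(lo * hi)  # geometric midpoint
        if g(mid) > 0:
            hi = mid
        else:
            lo = mid
    return np.sqrt(lo * hi)

def asymptotic_n(pv, x=1.0, r=0.02, mean_rm=0.08, beta_bar=1.0, K=2.25, C=0.2, delta=0.63):
    """High-diversification limit: N proportional to (x P_v)^((1-delta)/(2-delta))."""
    prefactor = (K * beta_bar * (mean_rm - r) / (2 * (1 + r) * C * (1 - delta))) ** (1 / (2 - delta))
    return prefactor * (x * pv) ** ((1 - delta) / (2 - delta))

delta = 0.63
account_values = np.logspace(5, 9, 9)
n_exact = np.array([optimal_n(pv, delta=delta) for pv in account_values])
n_limit = asymptotic_n(account_values, delta=delta)

slope = np.polyfit(np.log(account_values), np.log(n_exact), 1)[0]
print(slope, (1 - delta) / (2 - delta))
print(n_exact / n_limit)
```

The fitted slope is slightly above $(1-\delta)/(2-\delta)$ because the smallest account values correspond to a moderate $N$, where the terms neglected in the high-diversification limit still matter.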

Turnover, number of assets and account value
The result above only links N with P v , but one also wishes to obtain relationships that involve the turnover per transaction, T . Whereas in section 3, we have characterized the turnover of any transaction, the results of section 4 rest on the assumption that the agents build their portfolio by selecting a group of assets and stick to them over a period of time. This, obviously, does not include the possibility of speculating by a series of buy and sell trades on even a single asset, nor portfolio rebalancing which consists in adjusting the relative proportions of some assets. We thus have to find a way to differentiate between portfolio building, rebalancing and speculation. Here, we shall focus on portfolio building in order to test and link the results of section 4 to those of section 3.
We have found a simple effective method that can separate portfolio-building transactions from the other ones: we assume that the transactions of trader $i$ that correspond to the building of his portfolio are restricted to the first transaction on each asset not traded previously; sell orders are ignored, since Swissquote clients cannot short sell easily. In other words, if trader $i$ owns some shares of assets A, B, and C and then buys some shares of asset D, the corresponding transaction is deemed to contribute to his portfolio building process; the set of such transactions is denoted by $\Phi_i$, while the full set of transactions is denoted by $\Omega_i$. Any subsequent transactions of shares of assets A, B, C, or D are left out of $\Phi_i$. The number of different assets that trader $i$ owns is supposed to be $N_i = |\Phi_i|$, where $|X|$ is the cardinality of set $X$; this approach assumes that a trader always owns shares in all the assets ever traded; surprisingly, this is by far the most common case. We shall drop the index $i$ from now on.
Let us now focus on $T_\Phi = \sum_{k \in \Phi} T_k$, the total turnover that helped build his portfolio. We should first check how it is related to the total portfolio value $P_v$. Let us define $\langle P_v \rangle_\Phi$, the account value of a trader averaged over the times at which he trades a new asset. Plotting $\log \langle P_v \rangle_\Phi$ against $\log T_\Phi$ gives a cloudy relationship, as usual, but fitting it with $\log \langle P_v \rangle_\Phi = \chi \log T_\Phi$ gives $\chi = 1.03 \pm 0.02$ for individuals, $\chi = 0.99 \pm 0.02$ for asset managers and $\chi = 1.00 \pm 0.01$ for companies, with an adjusted $R^2 = 0.99$ in all cases. This relationship trivially holds for the traders who buy all their assets at once, as assumed in the portfolio model. The traders who do not lie on this line either hold positions in cash (in which case this line is a lower bound), or do not build their portfolio in a single day: they pile up positions in derivative products or stocks whose price fluctuations are the origin of the deviations from the line.
But the fact that the slope is close to 1 means that the average fluctuation is zero, hence, that on average traders do not make money from the positions taken on new stocks. The consequence of this is that $\log P_v$ can be replaced by $\log T_\Phi$ in (19), thus, setting $x = 1$,
$$\log N = \frac{1-\delta}{2-\delta} \log T_\Phi + \kappa.$$
The $x = 1$ assumption is in fact quite reasonable: most Swissquote traders do not use their trading account as savings accounts and are fully invested; we do not know what amount they keep on their other bank accounts.
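The rule used above to isolate portfolio-building transactions (the first buy of each asset, sells ignored) is simple enough to be sketched in a few lines of Python. The `Trade` record, its field names and the sample history are hypothetical; in particular, the sketch assumes that trades are supplied in chronological order and that a sell does not mark an asset as already traded.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Trade:
    asset: str
    side: str        # "buy" or "sell"
    turnover: float  # price paid times volume, fees excluded

def portfolio_building_trades(trades):
    """Return Phi: the first buy of each asset, given trades in chronological order."""
    seen, phi = set(), []
    for trade in trades:
        if trade.side != "buy":
            continue  # sell orders are ignored (assumption: they do not count as a first trade)
        if trade.asset not in seen:
            seen.add(trade.asset)
            phi.append(trade)
    return phi

# Hypothetical trading history Omega of one trader.
history = [
    Trade("A", "buy", 100.0),
    Trade("B", "buy", 50.0),
    Trade("A", "buy", 30.0),   # A already traded: rebalancing or speculation, not in Phi
    Trade("A", "sell", 80.0),  # sells are ignored
    Trade("C", "buy", 20.0),
]
phi = portfolio_building_trades(history)
n_assets = len(phi)                         # N = |Phi|
t_phi = sum(t.turnover for t in phi)        # total portfolio-building turnover T_Phi
print(n_assets, t_phi)
```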
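A robust non-parametric fit does reveal a linear relationship between $\log N$ and $\log T_\Phi$ in a given region $(N, T_\Phi) \in \Gamma$ (figure 8). In this region, we have
$$\log N = \alpha \log T_\Phi + \mathrm{const},$$
which, compared with (19) at $x = 1$, gives
$$\alpha = \frac{1-\delta}{2-\delta}.$$
We still need to link the mean turnover per portfolio-building transaction, $\langle T \rangle_\Phi = T_\Phi / N$, and $\langle P_v \rangle_\Phi^{\beta}$. While section 3 showed that the unconditional averages lead to $T \sim P_v^{\beta}$, one also finds that $\langle T \rangle_\Phi \sim \langle P_v \rangle_\Phi^{\beta}$. Therefore, one can write
$$\langle T \rangle_\Phi = \frac{T_\Phi}{N} \sim T_\Phi^{1-\alpha} \sim \langle P_v \rangle_\Phi^{1-\alpha}.$$
Thus, one is finally rewarded with the missing link
$$\beta = 1 - \alpha = \frac{1}{2-\delta},$$
which directly involves the transaction cost structure in the relationship between turnover and portfolio value, as argued in section 3 (see footnote 2). This relationship allows us to close the loop as we are now able to relate directly the exponents linking $T$, $N$, and $P_v$. Going back to section 3, one understands that the existence of a bi-linear relationship between log-turnover and log-account value, i.e., of two values of $\beta$ for each of the three categories of clients, is linked to two values of $\delta$: a flat fee structure or the disregard for transaction costs leads to $\beta = 1/2$, while proportional fees ($\delta = 1$) give $\beta = 1$.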
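The relations between the three exponents are easy to tabulate. The short Python sketch below encodes $\alpha = (1-\delta)/(2-\delta)$, $\beta = 1/(2-\delta)$ and the inverse relation $\delta_{eff} = 2 - 1/\beta$, and evaluates them for the two limiting fee structures and for the fitted $\delta = 0.63$; the function names are hypothetical.

```python
def alpha_from_delta(delta):
    """Exponent of N versus portfolio-building turnover."""
    return (1.0 - delta) / (2.0 - delta)

def beta_from_delta(delta):
    """Exponent of mean turnover versus account value."""
    return 1.0 / (2.0 - delta)

def delta_eff_from_beta(beta):
    """Effective fee exponent implied by a measured beta."""
    return 2.0 - 1.0 / beta

for label, delta in [("flat fee", 0.0), ("fitted schedule", 0.63), ("proportional fee", 1.0)]:
    alpha, beta = alpha_from_delta(delta), beta_from_delta(delta)
    print(f"{label}: delta={delta:.2f} alpha={alpha:.3f} beta={beta:.3f} alpha+beta={alpha + beta:.3f}")
```

Whatever the value of $\delta$, the two exponents add up to one, which is the consistency check applied to the measured values.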
Let us finally discuss the empirical values of $\alpha$, $\beta$, and $\delta$ against their theoretical counterparts, which are summarized in table 8.
1. Small values of $T_\Phi$: it was impossible to measure $\alpha$ in that case, since the non-parametric fit shows a non-linear relationship in the log-log plot for retail clients, which we trust more since it contains many more points than the graphs for the two other categories of clients. But it may not make sense to expect a linear relationship, since such a relationship is only expected for $N$ large enough ($N \geq 10$ in practice) and a small $T_\Phi$ is related to a small $N$. Thus we can only test $\beta = 1/(2 - \delta)$. The reported value of $\beta$ is consistent across all the clients. Retail clients have a larger $\delta_{\mathrm{eff}} = 2 - 1/\beta$ than the estimated $\delta_{\mathrm{SQ}}$. Since the shape of the fee structure is discontinuous, the values of these exponents can hardly be expected to match. However, fitting the whole fee curve may be problematic in this context: indeed, the traders with a typically small value of $T_\Phi$ see a more linear relationship in the region of small transaction values than when considering the whole curve; for instance, removing the two largest segments from the fee structure yields $\delta_{\mathrm{SQ}} = 0.74 \in [0.43, 0.79]$, which is not far from $\delta_{\mathrm{eff}}$.

2. Large values of $T_\Phi$: the relationships between all the exponents are verified for the three categories of clients. While not very impressive for companies and asset managers, this result is much stronger in the case of retail clients, since the relative uncertainties associated with each measured exponent are small (1-2%). The value of $\beta_{\mathrm{retail}}$ is of particular interest, as it corresponds to $\delta_{\mathrm{eff}} = 0$ or, equivalently, to a flat fee structure. Going back to the fee structure of Swissquote, one finds that the transition happens when the relative transaction cost falls below some threshold (we cannot give its precise value for confidentiality reasons; it is smaller than 1%). A possible explanation is that some traders with a high enough average turnover have a flat-fee agreement with Swissquote, while the rest of them simply act as if they were not able to take transaction costs correctly into account. Since not all traders have a flat-fee agreement, one must conclude that some traders do indeed have problems estimating small relative fees and simply disregard them. The reported value of $\beta$ for companies and asset managers is larger than $\beta_{\mathrm{retail}}$, but it is more likely than not that the small sample size is responsible for this discrepancy, since these two categories of clients have a greater propensity to negotiate a flat-fee structure.
3. Transition between the two regimes: the transition between the standard Swissquote fee structure and an effective flat-fee structure occurs at the same average value of $T$ for the three categories of traders (idem for $T_\Phi$). Since there is no automatic switching between fee structures at Swissquote at any predefined transaction value, one is led to conclude that this transition has behavioural origins, which are also responsible for the value at which the transition takes place; in passing, this value corresponds to the end of the plateau of the RCDF of $P_v$ in the case of retail clients ($e^{15} \simeq 3.27 \cdot 10^6$). As a consequence, it is likely that traders tend either to neglect transaction fees smaller than some threshold or to consider them as constant when they build their portfolio.
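The exponent relations used in the three points above, β = 1/(2 − δ) and its inverse δ_eff = 2 − 1/β, can be checked numerically. The following is a minimal sketch; the function names are hypothetical, and the inputs are the limiting cases quoted in the text rather than measured values.

```python
import math


def beta_from_delta(delta: float) -> float:
    """Turnover exponent beta implied by a fee exponent delta: beta = 1 / (2 - delta)."""
    if delta >= 2.0:
        raise ValueError("the relation is only meaningful for delta < 2")
    return 1.0 / (2.0 - delta)


def delta_eff_from_beta(beta: float) -> float:
    """Effective fee exponent implied by a measured beta: delta_eff = 2 - 1 / beta."""
    if beta <= 0.0:
        raise ValueError("beta must be positive")
    return 2.0 - 1.0 / beta


# Limiting cases discussed in the text.
flat_fee_beta = beta_from_delta(0.0)      # flat fee or disregarded costs
proportional_beta = beta_from_delta(1.0)  # proportional fees
print(flat_fee_beta, proportional_beta)

# End of the plateau of the RCDF of the account value for retail clients.
plateau_end = math.exp(15)
print(f"{plateau_end:.3g}")
```

Inverting the relation in this way is what allows a measured β to be read as an effective fee exponent and compared with the exponent fitted on the broker's fee schedule.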

Discussion and outlook
We have been able to determine empirically a bilinear relationship between the average log-turnover and the average log-account value, and have argued that it comes from the transaction fee structure of the broker and its perception by the agents. A theoretical derivation of optimal simple one-shot mean-variance portfolios with non-linear transaction costs predicted relationships between turnover, number of different assets in the portfolio and log-account values that could be verified empirically. This means that the populations of traders do take the transaction costs correctly into account on average, i.e. collectively, and act collectively as mean-variance equally-weighted portfolio optimizers. This is not to say that each trader is a mean-variance optimizer, but that the population taken as a whole behaves as such, with differences across populations, as discussed in the previous section. This is to be related to the findings of Kirman's famous work on average demand and offer curves in Marseille's fish market [24] and, more generally, to what has become known as the wisdom of the crowds (see [35] for an easy-to-read account).
The fact that the turnover depends in a non-linear way on the account value implies that linking the exponents of the distributions of transaction volume, buying power of large players in financial markets, and price return is more complex than previously thought [23]. It also has implications for agent-based models, which from now on must take into account the fact that real traders invest in a number of assets that depends non-linearly on their wealth.
Future research will address the relationship between account value and trading frequency, which is of utmost importance for understanding whether the many small trades of small investors have an influence on financial markets comparable to that of institutional investors. This will give an understanding of who provides liquidity and of what all the non-linear relationships found above mean in this respect. This is also crucial in agent-based models, in which one often imposes such a relationship by hand, arbitrarily; conversely, one will be able to validate the evolutionary mechanisms of agent-based models according to the relationship between trading frequency, turnover, number of assets and account value that they achieve in their steady state.