



How do I overfit?




This is a weird question, I know.



I'm just a noob and trying to learn about different classifier options and how they work. So I'm asking the question:



Given a dataset of n1-dimensions and n2-observations where each observation can be classified into n3-buckets, which algorithm most efficiently (ideally with just one training iteration) produces a model (classification boundary) that would perfectly classify every observation in the dataset (completely overfit)?



In other words, how does one most easily overfit?



(Please don't lecture me on 'not overfitting'. This is just for theoretical educational purposes.)



I have a suspicion that the answer is something like this: "Well, if your number of dimensions is greater than your number of observations, use X algorithm to draw the boundary(ies); otherwise use Y algorithm."



I also have a suspicion that the answer will say: "You can draw a smooth boundary, but that is more computationally expensive than drawing straight lines between all differently classified observations."



But that's as far as my intuition will guide me. Can you help?



I have a hand-drawn example of what I think I'm talking about in 2D with binary classification.



Basically, just split the difference, right? What algorithm does this efficiently for n-dimensions?













asked 5 hours ago by Legit Stack · edited 1 hour ago by Peter Mortensen · tagged: overfitting







  • knn with $k=1$? – shimao, 5 hours ago

  • @shimao I suppose that would work, wouldn't it? Yeah, I can't see why not. Thank you very much! – Legit Stack, 5 hours ago

  • @shimao Is that the most efficient way to encode the boundary? Probably, right? Since we don't know if the data is entirely random or not, using the data itself as the encoded model with the KNN algorithm is probably the best you can generally do. Right? – Legit Stack, 5 hours ago

  • @shimao: do you want to post your comment as an answer (perhaps with a little more detail)? – Stephan Kolassa, 5 hours ago























2 Answers



















As long as all the observations are unique, K-nearest neighbors with K set to 1 and any valid distance metric will give a classifier that perfectly fits the training set (since the nearest neighbor of every point in the training set is, trivially, itself). And it's probably the most efficient, since no training at all is needed.




"Is that the most efficient way to encode the boundary? Probably, right? Since we don't know if the data is entirely random or not, using the data itself as the encoded model with the KNN algorithm is probably the best you can generally do. Right?"




It's the most time-efficient, but not necessarily the most space efficient.






– shimao, answered 4 hours ago
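For illustration (not part of the answer itself), here is a minimal R sketch of the k = 1 idea, assuming the standard class package; the data are made up:

library(class)   # provides knn()

set.seed(1)
X <- matrix(runif(50 * 3), ncol = 3)                        # 50 observations, 3 dimensions
y <- factor(sample(c("a", "b", "c"), 50, replace = TRUE))   # 3 buckets, assigned at random

# "Train" and "test" on the same points: each observation's nearest
# neighbour is itself, so every training label is recovered exactly.
pred <- knn(train = X, test = X, cl = y, k = 1)
mean(pred == y)   # 1: a perfect (over)fit on the training data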












  • If you want space efficiency, store the hashcodes. Also note that you only need to analyze N-1 buckets. – Mooing Duck, 1 hour ago
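A rough sketch of the lookup-table idea from the comment above (my illustration, not the commenter's code), assuming the digest package is available for the hashcodes:

library(digest)   # digest() returns a hash of an R object

set.seed(1)
X <- matrix(runif(20 * 3), ncol = 3)
y <- sample(c("a", "b"), 20, replace = TRUE)

# The "model" is just a lookup table: one hash per observation, mapped to its label.
model <- setNames(y, apply(X, 1, digest))

# "Predicting" the training set means hashing each row again and looking it up.
pred <- model[apply(X, 1, digest)]
mean(pred == y)   # 1 on the training data; unseen observations simply miss (NA)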




















You can't.



At least not in general, to the degree you want, if you want a perfect fit with arbitrary data and arbitrary dimensionality.



As an example, suppose we have $n_1=0$ predictor dimensions (i.e., none at all) and $n_2=2$ observations classified into $n_3=2$ buckets. The two observations are classified into two different buckets, namely "chocolate" and "vanilla".



Since you don't have any predictors, you will not be able to classify them perfectly, period.




If you have at least one predictor that takes different values on each observation, then you can indeed overfit arbitrarily badly, simply by using arbitrarily high polynomial orders for a numerical predictor (if the predictor is categorical with different values on each observation, you don't even need to transform). The tool or model is pretty much secondary. Yes, it's easy to overfit.



Here is an example. The 10 observations are completely independent of the single numerical predictor. We fit logistic regressions on increasingly high polynomial powers of the predictor and classify using a threshold of 0.5 (which is not good practice). Correctly fitted points are marked in green, incorrectly fitted ones in red.



[Figure: one panel per polynomial order, showing the fitted probability curve, with correctly classified points in green and misclassified points in red]



R code:



nn <- 10
set.seed(2)

predictor <- runif(nn)
outcome <- runif(nn) > 0.5

plot(predictor, outcome, pch = 19, yaxt = "n", ylim = c(-0.1, 1.6))
axis(2, c(0, 1), c("FALSE", "TRUE"))

orders <- c(1, 2, 3, 5, 7, 9)
xx <- seq(min(predictor), max(predictor), 0.01)

par(mfrow = c(3, 2))
for (kk in seq_along(orders)) {
  # NOTE: the loop body is a plausible reconstruction (it was garbled above):
  # fit a logistic regression on a polynomial of order orders[kk], draw the
  # fitted curve, and mark which points the 0.5 threshold classifies correctly.
  plot(predictor, outcome, pch = 19, yaxt = "n", ylim = c(-0.1, 1.6),
       main = paste("Polynomial order", orders[kk]))
  axis(2, c(0, 1), c("FALSE", "TRUE"))
  model <- glm(outcome ~ poly(predictor, orders[kk]), family = "binomial")
  fits_obs <- predict(model, type = "response")
  lines(xx, predict(model, newdata = data.frame(predictor = xx), type = "response"))
  correct <- (fits_obs >= 0.5 & outcome) | (fits_obs < 0.5 & !outcome)
  points(predictor[correct], outcome[correct], cex = 1.4, col = "green", pch = "o")
  points(predictor[!correct], outcome[!correct], cex = 1.4, col = "red", pch = "o")
}






– Stephan Kolassa, answered 5 hours ago












