Efficient Algorithms for Destroyed Document ReconstructionStereo images rectification and disparity: which algorithms?Which algorithms are usable for heatmaps and what are their pros and consEfficient flood filling (seed filling)How to test image segmentation algorithms?Viewing images that compressed using lossless algorithmsSources on dictionary learning and related algorithmsHigh Dimensional Spaces for ImagesImage registration algorithms for images with varying distancesAlgorithms to correct misspelled word?More Efficient Feature Method Than Haar-Feature For Face Detection

How do I write real-world stories separate from my country of origin?

Proto-Indo-European (PIE) words with IPA

Are there historical examples of audiences drawn to a work that was "so bad it's good"?

Is there a solution to paying high fees when opening and closing lightning channels once we hit a fee only market?

Does science define life as "beginning at conception"?

Does attacking (or having a rider attack) cancel Charge/Pounce-like abilities?

amsmath: How can I use the equation numbering and label manually and anywhere?

Gas chromatography flame ionization detector (FID) - why hydrogen gas?

A nasty indefinite integral

Can a UK national work as a paid shop assistant in the USA?

Why did Nick Fury not hesitate in blowing up the plane he thought was carrying a nuke?

One word for 'the thing that attracts me'?

How to create razor wire

Find this Unique UVC Palindrome ( ignoring signs and decimal) from Given Fractional Relationship

Surface of the 3x3x3 cube as a graph

Adobe Illustrator: How can I change the profile of a dashed stroke?

Caught with my phone during an exam

Can someone get a spouse off a deed that never lived together and was incarcerated?

Why is this python script running in background consuming 100 % CPU?

Keeping the dodos out of the field

Must every right-inverse of a linear transformation be a linear transformation?

Are clauses with "который" restrictive or non-restrictive by default?

Make the `diff` command look only for differences from a specified range of lines

How could the B-29 bomber back up under its own power?



Efficient Algorithms for Destroyed Document Reconstruction


Stereo images rectification and disparity: which algorithms?Which algorithms are usable for heatmaps and what are their pros and consEfficient flood filling (seed filling)How to test image segmentation algorithms?Viewing images that compressed using lossless algorithmsSources on dictionary learning and related algorithmsHigh Dimensional Spaces for ImagesImage registration algorithms for images with varying distancesAlgorithms to correct misspelled word?More Efficient Feature Method Than Haar-Feature For Face Detection













1












$begingroup$


I am not certain this is the proper site for this question however I am mainly looking for resources on this topic (perhaps code). I was watching TV and one of the characters had a lawyer who destroyed his documents using a paper shredder. A lab tech said that the shredder was special.



I am not familiar with this area of computer science/ mathematics but I am looking for information on efficient algorithms to reconstruct destroyed documents. I can come up with a naive approach that is brute force fairly easily I imagine but just going through all the pieces and looking for edges that are the same but this doesn't sound feasible as the number of combinations will explode.



Note: By destroyed documents I am talking about taking a document (printed out) and then shredding it into small pieces and reassembling it by determining which pieces fit together.










share|cite|improve this question









New contributor



Shogun is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.






$endgroup$







  • 1




    $begingroup$
    Can you edit your question to define "destroyed documents"?
    $endgroup$
    – lox
    2 hours ago






  • 1




    $begingroup$
    You should look at the methods used to recover the Stasi (East German secret police) archives that were shredded or mostly -- oops all the shredders are broken from over use -- torn up after the fall of the Berlin Wall. The BBC has a very high-level summary.
    $endgroup$
    – David Richerby
    2 hours ago















1












$begingroup$


I am not certain this is the proper site for this question however I am mainly looking for resources on this topic (perhaps code). I was watching TV and one of the characters had a lawyer who destroyed his documents using a paper shredder. A lab tech said that the shredder was special.



I am not familiar with this area of computer science/ mathematics but I am looking for information on efficient algorithms to reconstruct destroyed documents. I can come up with a naive approach that is brute force fairly easily I imagine but just going through all the pieces and looking for edges that are the same but this doesn't sound feasible as the number of combinations will explode.



Note: By destroyed documents I am talking about taking a document (printed out) and then shredding it into small pieces and reassembling it by determining which pieces fit together.










share|cite|improve this question









New contributor



Shogun is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.






$endgroup$







  • 1




    $begingroup$
    Can you edit your question to define "destroyed documents"?
    $endgroup$
    – lox
    2 hours ago






  • 1




    $begingroup$
    You should look at the methods used to recover the Stasi (East German secret police) archives that were shredded or mostly -- oops all the shredders are broken from over use -- torn up after the fall of the Berlin Wall. The BBC has a very high-level summary.
    $endgroup$
    – David Richerby
    2 hours ago













1












1








1





$begingroup$


I am not certain this is the proper site for this question however I am mainly looking for resources on this topic (perhaps code). I was watching TV and one of the characters had a lawyer who destroyed his documents using a paper shredder. A lab tech said that the shredder was special.



I am not familiar with this area of computer science/ mathematics but I am looking for information on efficient algorithms to reconstruct destroyed documents. I can come up with a naive approach that is brute force fairly easily I imagine but just going through all the pieces and looking for edges that are the same but this doesn't sound feasible as the number of combinations will explode.



Note: By destroyed documents I am talking about taking a document (printed out) and then shredding it into small pieces and reassembling it by determining which pieces fit together.










share|cite|improve this question









New contributor



Shogun is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.






$endgroup$




I am not certain this is the proper site for this question however I am mainly looking for resources on this topic (perhaps code). I was watching TV and one of the characters had a lawyer who destroyed his documents using a paper shredder. A lab tech said that the shredder was special.



I am not familiar with this area of computer science/ mathematics but I am looking for information on efficient algorithms to reconstruct destroyed documents. I can come up with a naive approach that is brute force fairly easily I imagine but just going through all the pieces and looking for edges that are the same but this doesn't sound feasible as the number of combinations will explode.



Note: By destroyed documents I am talking about taking a document (printed out) and then shredding it into small pieces and reassembling it by determining which pieces fit together.







image-processing






share|cite|improve this question









New contributor



Shogun is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.










share|cite|improve this question









New contributor



Shogun is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.








share|cite|improve this question




share|cite|improve this question








edited 2 hours ago







Shogun













New contributor



Shogun is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.








asked 3 hours ago









ShogunShogun

1085




1085




New contributor



Shogun is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.




New contributor




Shogun is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.









  • 1




    $begingroup$
    Can you edit your question to define "destroyed documents"?
    $endgroup$
    – lox
    2 hours ago






  • 1




    $begingroup$
    You should look at the methods used to recover the Stasi (East German secret police) archives that were shredded or mostly -- oops all the shredders are broken from over use -- torn up after the fall of the Berlin Wall. The BBC has a very high-level summary.
    $endgroup$
    – David Richerby
    2 hours ago












  • 1




    $begingroup$
    Can you edit your question to define "destroyed documents"?
    $endgroup$
    – lox
    2 hours ago






  • 1




    $begingroup$
    You should look at the methods used to recover the Stasi (East German secret police) archives that were shredded or mostly -- oops all the shredders are broken from over use -- torn up after the fall of the Berlin Wall. The BBC has a very high-level summary.
    $endgroup$
    – David Richerby
    2 hours ago







1




1




$begingroup$
Can you edit your question to define "destroyed documents"?
$endgroup$
– lox
2 hours ago




$begingroup$
Can you edit your question to define "destroyed documents"?
$endgroup$
– lox
2 hours ago




1




1




$begingroup$
You should look at the methods used to recover the Stasi (East German secret police) archives that were shredded or mostly -- oops all the shredders are broken from over use -- torn up after the fall of the Berlin Wall. The BBC has a very high-level summary.
$endgroup$
– David Richerby
2 hours ago




$begingroup$
You should look at the methods used to recover the Stasi (East German secret police) archives that were shredded or mostly -- oops all the shredders are broken from over use -- torn up after the fall of the Berlin Wall. The BBC has a very high-level summary.
$endgroup$
– David Richerby
2 hours ago










1 Answer
1






active

oldest

votes


















2












$begingroup$

Your problem is NP-Complete, even for strips (n strips yields (2n)!) used, so people use heuristics, transforms like Hough and morphological filters (to match continuity of text, but this heavily increases complexity for matching) or any kind of genetic / NN search, Ants Colony Optimization.



For summary of consecutive steps and various algorithm I recommend An Investigation into Automated Shredded Document Reconstruction using Heuristic Search Algorithms.



The problem itself may end up in nasty cases, when document is not fully sharp (blurred, printed with low resolution) and strips width is small and cut by physical cutter with dulled edges, because standard merging methods like panorama photo sticher gets lost and yield improper results. This is due to lost information by missing small strips, otherwise if you have full digital image cut into pieces, it is as hard as Jigsaw puzzle, non-digital image falls into approximate search.



To make algorithm automatic another problem is pieces feeding, rarely you can give axis aligned strips, so to start process it is nice to input all stripes as one picture with pieces lay by hand, this imposes another problem (this one is easy) to detect blobs and rotate them.



By special shredder instead of stripes yield very small rectangles. For comparison, P-1 class shredder gives stripes 6-12mm wide of any length (about 1800mm^2), class P-7 gives rectangles with area less than 5mm^2. When you get rectangles instead of stripes, problem yields (4n)! permutations, assuming one one-sided document, if there are lots of shreds from unrelated documents (no pictures, text only) in one bag, problem is not really tractable.






share|cite|improve this answer









$endgroup$












  • $begingroup$
    There may be (2n)! arrangements of the shredded strips, but does that still determine the time complexity? Whenever you find "matches" you can "group" them together, behaving as a "thick strip", where only the first and last edge matter for the sake of comparison against other strips. This "clumping" should reduce the search space hugely, but IDK if it will still be O(n!)
    $endgroup$
    – Alexander
    14 mins ago











Your Answer








StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "419"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);



);






Shogun is a new contributor. Be nice, and check out our Code of Conduct.









draft saved

draft discarded


















StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcs.stackexchange.com%2fquestions%2f109567%2fefficient-algorithms-for-destroyed-document-reconstruction%23new-answer', 'question_page');

);

Post as a guest















Required, but never shown

























1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes









2












$begingroup$

Your problem is NP-Complete, even for strips (n strips yields (2n)!) used, so people use heuristics, transforms like Hough and morphological filters (to match continuity of text, but this heavily increases complexity for matching) or any kind of genetic / NN search, Ants Colony Optimization.



For summary of consecutive steps and various algorithm I recommend An Investigation into Automated Shredded Document Reconstruction using Heuristic Search Algorithms.



The problem itself may end up in nasty cases, when document is not fully sharp (blurred, printed with low resolution) and strips width is small and cut by physical cutter with dulled edges, because standard merging methods like panorama photo sticher gets lost and yield improper results. This is due to lost information by missing small strips, otherwise if you have full digital image cut into pieces, it is as hard as Jigsaw puzzle, non-digital image falls into approximate search.



To make algorithm automatic another problem is pieces feeding, rarely you can give axis aligned strips, so to start process it is nice to input all stripes as one picture with pieces lay by hand, this imposes another problem (this one is easy) to detect blobs and rotate them.



By special shredder instead of stripes yield very small rectangles. For comparison, P-1 class shredder gives stripes 6-12mm wide of any length (about 1800mm^2), class P-7 gives rectangles with area less than 5mm^2. When you get rectangles instead of stripes, problem yields (4n)! permutations, assuming one one-sided document, if there are lots of shreds from unrelated documents (no pictures, text only) in one bag, problem is not really tractable.






share|cite|improve this answer









$endgroup$












  • $begingroup$
    There may be (2n)! arrangements of the shredded strips, but does that still determine the time complexity? Whenever you find "matches" you can "group" them together, behaving as a "thick strip", where only the first and last edge matter for the sake of comparison against other strips. This "clumping" should reduce the search space hugely, but IDK if it will still be O(n!)
    $endgroup$
    – Alexander
    14 mins ago















2












$begingroup$

Your problem is NP-Complete, even for strips (n strips yields (2n)!) used, so people use heuristics, transforms like Hough and morphological filters (to match continuity of text, but this heavily increases complexity for matching) or any kind of genetic / NN search, Ants Colony Optimization.



For summary of consecutive steps and various algorithm I recommend An Investigation into Automated Shredded Document Reconstruction using Heuristic Search Algorithms.



The problem itself may end up in nasty cases, when document is not fully sharp (blurred, printed with low resolution) and strips width is small and cut by physical cutter with dulled edges, because standard merging methods like panorama photo sticher gets lost and yield improper results. This is due to lost information by missing small strips, otherwise if you have full digital image cut into pieces, it is as hard as Jigsaw puzzle, non-digital image falls into approximate search.



To make algorithm automatic another problem is pieces feeding, rarely you can give axis aligned strips, so to start process it is nice to input all stripes as one picture with pieces lay by hand, this imposes another problem (this one is easy) to detect blobs and rotate them.



By special shredder instead of stripes yield very small rectangles. For comparison, P-1 class shredder gives stripes 6-12mm wide of any length (about 1800mm^2), class P-7 gives rectangles with area less than 5mm^2. When you get rectangles instead of stripes, problem yields (4n)! permutations, assuming one one-sided document, if there are lots of shreds from unrelated documents (no pictures, text only) in one bag, problem is not really tractable.






share|cite|improve this answer









$endgroup$












  • $begingroup$
    There may be (2n)! arrangements of the shredded strips, but does that still determine the time complexity? Whenever you find "matches" you can "group" them together, behaving as a "thick strip", where only the first and last edge matter for the sake of comparison against other strips. This "clumping" should reduce the search space hugely, but IDK if it will still be O(n!)
    $endgroup$
    – Alexander
    14 mins ago













2












2








2





$begingroup$

Your problem is NP-Complete, even for strips (n strips yields (2n)!) used, so people use heuristics, transforms like Hough and morphological filters (to match continuity of text, but this heavily increases complexity for matching) or any kind of genetic / NN search, Ants Colony Optimization.



For summary of consecutive steps and various algorithm I recommend An Investigation into Automated Shredded Document Reconstruction using Heuristic Search Algorithms.



The problem itself may end up in nasty cases, when document is not fully sharp (blurred, printed with low resolution) and strips width is small and cut by physical cutter with dulled edges, because standard merging methods like panorama photo sticher gets lost and yield improper results. This is due to lost information by missing small strips, otherwise if you have full digital image cut into pieces, it is as hard as Jigsaw puzzle, non-digital image falls into approximate search.



To make algorithm automatic another problem is pieces feeding, rarely you can give axis aligned strips, so to start process it is nice to input all stripes as one picture with pieces lay by hand, this imposes another problem (this one is easy) to detect blobs and rotate them.



By special shredder instead of stripes yield very small rectangles. For comparison, P-1 class shredder gives stripes 6-12mm wide of any length (about 1800mm^2), class P-7 gives rectangles with area less than 5mm^2. When you get rectangles instead of stripes, problem yields (4n)! permutations, assuming one one-sided document, if there are lots of shreds from unrelated documents (no pictures, text only) in one bag, problem is not really tractable.






share|cite|improve this answer









$endgroup$



Your problem is NP-Complete, even for strips (n strips yields (2n)!) used, so people use heuristics, transforms like Hough and morphological filters (to match continuity of text, but this heavily increases complexity for matching) or any kind of genetic / NN search, Ants Colony Optimization.



For summary of consecutive steps and various algorithm I recommend An Investigation into Automated Shredded Document Reconstruction using Heuristic Search Algorithms.



The problem itself may end up in nasty cases, when document is not fully sharp (blurred, printed with low resolution) and strips width is small and cut by physical cutter with dulled edges, because standard merging methods like panorama photo sticher gets lost and yield improper results. This is due to lost information by missing small strips, otherwise if you have full digital image cut into pieces, it is as hard as Jigsaw puzzle, non-digital image falls into approximate search.



To make algorithm automatic another problem is pieces feeding, rarely you can give axis aligned strips, so to start process it is nice to input all stripes as one picture with pieces lay by hand, this imposes another problem (this one is easy) to detect blobs and rotate them.



By special shredder instead of stripes yield very small rectangles. For comparison, P-1 class shredder gives stripes 6-12mm wide of any length (about 1800mm^2), class P-7 gives rectangles with area less than 5mm^2. When you get rectangles instead of stripes, problem yields (4n)! permutations, assuming one one-sided document, if there are lots of shreds from unrelated documents (no pictures, text only) in one bag, problem is not really tractable.







share|cite|improve this answer












share|cite|improve this answer



share|cite|improve this answer










answered 2 hours ago









EvilEvil

8,47742447




8,47742447











  • $begingroup$
    There may be (2n)! arrangements of the shredded strips, but does that still determine the time complexity? Whenever you find "matches" you can "group" them together, behaving as a "thick strip", where only the first and last edge matter for the sake of comparison against other strips. This "clumping" should reduce the search space hugely, but IDK if it will still be O(n!)
    $endgroup$
    – Alexander
    14 mins ago
















  • $begingroup$
    There may be (2n)! arrangements of the shredded strips, but does that still determine the time complexity? Whenever you find "matches" you can "group" them together, behaving as a "thick strip", where only the first and last edge matter for the sake of comparison against other strips. This "clumping" should reduce the search space hugely, but IDK if it will still be O(n!)
    $endgroup$
    – Alexander
    14 mins ago















$begingroup$
There may be (2n)! arrangements of the shredded strips, but does that still determine the time complexity? Whenever you find "matches" you can "group" them together, behaving as a "thick strip", where only the first and last edge matter for the sake of comparison against other strips. This "clumping" should reduce the search space hugely, but IDK if it will still be O(n!)
$endgroup$
– Alexander
14 mins ago




$begingroup$
There may be (2n)! arrangements of the shredded strips, but does that still determine the time complexity? Whenever you find "matches" you can "group" them together, behaving as a "thick strip", where only the first and last edge matter for the sake of comparison against other strips. This "clumping" should reduce the search space hugely, but IDK if it will still be O(n!)
$endgroup$
– Alexander
14 mins ago










Shogun is a new contributor. Be nice, and check out our Code of Conduct.









draft saved

draft discarded


















Shogun is a new contributor. Be nice, and check out our Code of Conduct.












Shogun is a new contributor. Be nice, and check out our Code of Conduct.











Shogun is a new contributor. Be nice, and check out our Code of Conduct.














Thanks for contributing an answer to Computer Science Stack Exchange!


  • Please be sure to answer the question. Provide details and share your research!

But avoid


  • Asking for help, clarification, or responding to other answers.

  • Making statements based on opinion; back them up with references or personal experience.

Use MathJax to format equations. MathJax reference.


To learn more, see our tips on writing great answers.




draft saved


draft discarded














StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcs.stackexchange.com%2fquestions%2f109567%2fefficient-algorithms-for-destroyed-document-reconstruction%23new-answer', 'question_page');

);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







Popular posts from this blog

Log på Navigationsmenu

Wonderful Copenhagen (sang) Eksterne henvisninger | NavigationsmenurSide på frankloesser.comWonderful Copenhagen

Detroit Tigers Spis treści Historia | Skład zespołu | Sukcesy | Członkowie Baseball Hall of Fame | Zastrzeżone numery | Przypisy | Menu nawigacyjneEncyclopedia of Detroit - Detroit TigersTigers Stadium, Detroit, MITigers Timeline 1900sDetroit Tigers Team History & EncyclopediaTigers Timeline 1910s1935 World Series1945 World Series1945 World Series1984 World SeriesComerica Park, Detroit, MI2006 World Series2012 World SeriesDetroit Tigers 40-Man RosterDetroit Tigers Coaching StaffTigers Hall of FamersTigers Retired Numberse