dwg2csv is slow


Post by alex-b » Thu Nov 21, 2024 12:42 am

I have a DWF file (can't post it, it contains a lot of PII) with about 30k entities, and dwg2csv takes about 5 minutes to finish. During this time one of my CPU cores goes to 100% while the disk is barely utilized. I also noticed a lot of empty rows (just commas). If I run it with:

Code:

dwg2csv -platform offscreen -t Line -t Polyline -p 'Start Point:X' -p 'Start Point:Y' -p 'End Point:X' -p 'End Point:Y' ./file.dwf

the time drops to 2 minutes, and I still see lots of empty rows. Possibly related: dwg2svg is slow as well, so I'm guessing it has something to do with fetching all the entities.

I'm happy to help if possible, but I don't know where to look. I've checked the GitHub repo (https://github.com/qcad/qcad) but couldn't find anything about dwg2csv (I've seen in other posts that it is a proprietary add-on).

Is this just a limitation of the trial version?

OS: ubuntu 24 - linux/amd64 (docker)
QCAD version: qcad-3.31.2-trial-linux-x86_64


Post by CVH » Thu Nov 21, 2024 7:50 am

Hi, and welcome to the QCAD forum.

alex-b wrote:
Thu Nov 21, 2024 12:42 am
"Is this just a limitation of the trial version?"
No, the QCAD trial is a fully functional QCAD Pro version; its limitations are of another kind (sessions, session time).

alex-b wrote:
Thu Nov 21, 2024 12:42 am
"I also noticed a lot of empty rows (just commas)."
Or empty entries for the corresponding property listed in the top row.
Without a -p switch it defaults to all properties of the given object type(s).
And without a -t switch it defaults to all object types.

alex-b wrote:
Thu Nov 21, 2024 12:42 am
"so I'm guessing it's something that related to fetching all the entities."
Or the file format, or the content.
From the little information given, it could be tens of thousands of polylines with a million vertices. :wink:
How long does it take to load the file with the QCAD GUI?

For the rest, it is a serial approach: object by object, property by property.
Because the tool is geared to all types and all properties, the output is filtered by skipping whatever is 'not in the required list'.

alex-b wrote:
Thu Nov 21, 2024 12:42 am
"the time drops to 2 minutes, and I still see lots of empty rows."
Surely a small snippet of the export would not compromise privacy. :wink:
My guess: polyline vertices?

Regards,
CVH


Post by alex-b » Thu Nov 21, 2024 11:29 am

Hey @CVH

CVH wrote:
"How long does it take to load the file with the QCAD GUI?"
It's instant, less than 1 second. Zooming and moving around are instant as well, no lag.

CVH wrote:
"From the little information given, it could be tens of thousands of polylines with a million vertices. :wink:"
It's a house floor plan (1 level, 2D) with some text (names, addresses, etc., hence the PII). Everything fits on an A4 page. Eyeballing it I can see about 30-40 shapes (squares, rectangles, etc.), so I'm guessing most of the lines/polylines come from the text.

CVH wrote:
"Surely a small snippet of the export would not compromise privacy. :wink:"
So from what I can see, the CSV contains: 37k total rows, 20k empty rows (just commas, nothing else), 3k Polyline rows, 15k Line rows, 700 Point rows, 400 Text rows, 100 Circle rows, 100 Arc rows, 175 Hatch rows.
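
A tally like the one above can be reproduced with a few lines of Python. This is only a sketch: it assumes the export has a header row with a `Type` column (as in the snippet in this thread), and the sample data and the `(empty)` label are made up for illustration.

```python
import csv
import io
from collections import Counter

def count_rows_by_type(csv_file):
    """Count data rows per 'Type' value; rows with no content go under '(empty)'."""
    reader = csv.reader(csv_file)
    header = next(reader)
    type_idx = header.index("Type")
    counts = Counter()
    for row in reader:
        # A row of only commas (or a blank line) has no non-empty field.
        if not any(field.strip() for field in row):
            counts["(empty)"] += 1
        else:
            counts[row[type_idx]] += 1
    return counts

# Made-up miniature export, not taken from the real file.
sample = io.StringIO(
    "Type,Handle,Length\n"
    ",,\n"
    "Line,0x1,0.2\n"
    "Line,0x2,0.3\n"
    "Polyline,0x3,0.05\n"
    ",,\n"
)
for type_name, count in count_rows_by_type(sample).most_common():
    print(type_name, count)
```

For a real export, the `io.StringIO` object would be replaced by `open("export.csv", newline="")`, where the file name is whatever the output was saved as.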

Here's a random snippet from the CSV:

Code:

Type,Handle,Block,Layer,Linetype,Linetype Scale,Lineweight,Color,Displayed Color,Draw Order,Style:Overall scale,Style:Linear measurement factor,Style:Text height,Style:Dimension line gap,Style:Arrow size,Style:Dimension line increment,Style:Extension line extension,Style:Extension line offset,Style:Text position vertical,Style:Text horizontal,Style:Tick size,Style:Linear format,Style:Decimal places,Style:Decimal separator,Style:Zero suppression,Style:Angular format,Style:Angular decimal places,Style:Angular zero suppression,Style:Architectur tick,Style:Text color,Style:Arrow block,Style:Arrow block 1,Style:Arrow block 2,Center:X,Center:Y,Center:Z,Middle:X,Middle:Y,Middle:Z,Radius,Start Angle,End Angle,Reversed,Diameter,Length,Sweep Angle,Area,Referenced Block,Position:X,Position:Y,Position:Z,Scale:X,Scale:Y,Scale:Z,Angle,Columns,Rows,Column Spacing,Row Spacing,Circumference,Major Point:X,Major Point:Y,Major Point:Z,Ratio,Start Parameter,End Parameter,Start Point:X,Start Point:Y,Start Point:Z,End Point:X,End Point:Y,End Point:Z,Middle Point:X,Middle Point:Y,Middle Point:Z,Vertex:X,Vertex:Y,Vertex:Z,Vertex:Bulge,Size:Base Angle,Size:Size 1,Size:Size 2,Simple,Text Position:X,Text Position:Y,Text Position:Z,Text,Plain Text,Font Name,Text Height,Text Width,Text Angle,X Scale,Bold,Italic,Line Spacing,Alignment:Horizontal,Alignment:Vertical,Backward,Upside Down,Name,Origin:X,Origin:Y,Origin:Z,Off,Frozen,Locked,Collapsed,Plottable,Snappable,Off is Freeze,Tab Order,Min Limits:X,Min Limits:Y,Min Limits:Z,Max Limits:X,Max Limits:Y,Max Limits:Z,Insertion Base:X,Insertion Base:Y,Insertion Base:Z,Min Extents:X,Min Extents:Y,Min Extents:Z,Max Extents:X,Max Extents:Y,Max Extents:Z,Plot Margins:Left,Plot Margins:Bottom,Plot Margins:Right,Plot Margins:Top,Plot Paper Size:Width,Plot Paper Size:Height,Plot Origin:X,Plot Origin:Y,Plot Window Area Min:X,Plot Window Area Min:Y,Plot Window Area Max:X,Plot Window Area Max:Y,Custom Scale:Numerator,Custom Scale:Denominator,Plot Paper Units,Plot Rotation,Plot Type,Use Standard Scale,Standard Scale,Standard Scale Type,Media Name,Description,Metric,Pattern,Hidden,Pixel Unit,Layout,Polyline Pattern,Closed,Vertex:Angle,Vertex:Start Width,Vertex:End Width,Global Width,Orientation,Global Z,Solid,Alpha,Pattern:Name,Pattern:From Entity,Pattern:Angle,Pattern:Scale,Id

,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Line,0x220f,Viewport1,0,Continuous,1,0,#7f6f3f,#7f6f3f,5490,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.207283748412,,,,,,,,,,4.712388980385,,,,,,,,,,,,1255.675740520703,344.008468925418,0.000000000072,1255.675740520703,343.801185177006,0.000000000072,1255.675740520703,343.904827051212,0.000000000072,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,5550
Line,0x2210,Viewport1,0,Continuous,1,0,#7f6f3f,#7f6f3f,5491,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.228913356942,,,,,,,,,,4.712388980385,,,,,,,,,,,,1255.684752857545,344.019283729683,0.000000000072,1255.684752857545,343.790370372741,0.000000000072,1255.684752857545,343.904827051212,0.000000000072,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,5551
Line,0x2211,Viewport1,0,Continuous,1,0,#7f6f3f,#7f6f3f,5492,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.049567852868,,,,,,,,,,0,,,,,,,,,,,,1255.49819748383,344.019283729683,0.000000000072,1255.547765336698,344.019283729683,0.000000000072,1255.522981410264,344.019283729683,0.000000000072,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,5552
Line,0x2212,Viewport1,0,Continuous,1,0,#7f6f3f,#7f6f3f,5493,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.04866661923,,,,,,,,,,0,,,,,,,,,,,,1255.665826949989,344.019283729683,0.000000000072,1255.714493569219,344.019283729683,0.000000000072,1255.690160259604,344.019283729683,0.000000000072,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,5553
Line,0x2213,Viewport1,0,Continuous,1,0,#7f6f3f,#7f6f3f,5494,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.059481423581,,,,,,,,,,0,,,,,,,,,,,,1255.49819748383,343.790370372741,0.000000000072,1255.557678907411,343.790370372741,0.000000000072,1255.527938195621,343.790370372741,0.000000000072,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,5554
Line,0x2214,Viewport1,0,Continuous,1,0,#7f6f3f,#7f6f3f,5495,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.078407330904,,,,,,,,,,0,,,,,,,,,,,,1255.636086238315,343.790370372741,0.000000000072,1255.714493569219,343.790370372741,0.000000000072,1255.675289903767,343.790370372741,0.000000000072,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,5555
Line,0x2215,Viewport1,0,Continuous,1,0,#7f6f3f,#7f6f3f,5496,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.022584851543,,,,,,,,,,5.783838586374,,,,,,,,,,,,1255.508111054543,344.019283729683,0.000000000072,1255.527938195737,344.008468925418,0.000000000072,1255.51802462514,344.01387632755,0.000000000072,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,5556
Line,0x2216,Viewport1,0,Continuous,1,0,#7f6f3f,#7f6f3f,5497,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.023793252086,,,,,,,,,,4.282626704948,,,,,,,,,,,,1255.694666428026,344.019283729683,0.000000000072,1255.684752857545,343.997654121153,0.000000000072,1255.689709642786,344.008468925418,0.000000000072,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,5557
Line,0x2217,Viewport1,0,Continuous,1,0,#7f6f3f,#7f6f3f,5498,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.022584851543,,,,,,,,,,3.640939374395,,,,,,,,,,,,1255.704579998739,344.019283729683,0.000000000072,1255.684752857545,344.008468925418,0.000000000072,1255.694666428142,344.01387632755,0.000000000072,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,5558
Polyline,0x2218,Viewport1,0,Continuous,1,0,#7f6f3f,#7f6f3f,5499,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.045169702881,,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,false,false,,,,0,2,0,,,,,,,5559
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Text,0x2794,Viewport1,0,Continuous,1,0,#7f6f3f,#7f6f3f,6901,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,false,1227.997051471844,381.946802286944,0.000000000072,THERMAL INSULATION,THERMAL INSULATION,Swis721 LtCn BT,0.20818497031,0,0,1,false,false,1,0,3,false,false,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,6961

This is pretty much what it looks like: a lot of empty rows, then some Line rows, then one Polyline after all the Line rows. Many Polyline rows also look isolated, with just empty rows above and below. I've also added a Text row in case it matters.

I'll provide more info if necessary :)


Side note: I only need to extract the lines/polylines (segments). I'm not interested in the text or in the lines/polylines that make up the text. Is there a way to ignore them? I know I can select Lines and Polylines (you can see that in the command I ran), but that still selects the lines and polylines forming the text.
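
If the full export is all that is available, the four segment columns can also be picked out afterwards. The following is a minimal sketch, assuming the header names shown in the snippet above (`Type`, `Start Point:X`, and so on); the function name and the shortened sample are made up. It only handles Line rows, since Polyline vertices do not appear in the Polyline row itself.

```python
import csv
import io

COLUMNS = ["Start Point:X", "Start Point:Y", "End Point:X", "End Point:Y"]

def extract_line_segments(csv_file):
    """Return ((x1, y1), (x2, y2)) for every row whose Type is 'Line'."""
    segments = []
    for row in csv.DictReader(csv_file):
        if row.get("Type") != "Line":
            continue  # skips empty rows, Text, Polyline, ...
        if not all(row.get(c) for c in COLUMNS):
            continue  # skip Line rows with missing coordinates
        x1, y1, x2, y2 = (float(row[c]) for c in COLUMNS)
        segments.append(((x1, y1), (x2, y2)))
    return segments

# Shortened sample: only the columns needed here, values from the first Line row above.
sample = io.StringIO(
    "Type,Handle,Start Point:X,Start Point:Y,End Point:X,End Point:Y\n"
    ",,,,,\n"
    "Line,0x220f,1255.675740520703,344.008468925418,1255.675740520703,343.801185177006\n"
    "Text,0x2794,,,,\n"
)
segments = extract_line_segments(sample)
print(segments)
```

This does not solve the "text as line-art" problem; it only reduces the export to coordinates.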


Post by CVH » Thu Nov 21, 2024 2:50 pm

Hi,

If you ran the command as in your first post, then the types should be filtered on "Line" and "Polyline".
:arrow: I can't explain why a Text entity would be listed at all ...
:arrow: I can't explain the lines with nothing but commas ... No 'Type', no 'Handle', no 'Id': there is nothing at all.

The export contains a column for each of 169 different properties.
:arrow: Meaning: it is NOT filtered on the 4 given property switches ...
What if you use double quotes?
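
As a quick check of the quoting question: Python's `shlex.split` approximates POSIX shell word splitting, and under those rules single and double quotes around `Start Point:X` give the same argument. This is a sketch under the assumption that the command runs in a POSIX-like shell (such as bash in an Ubuntu container); the command line is shortened from the one in the first post.

```python
import shlex

# Same command, once with single and once with double quotes.
single = shlex.split("dwg2csv -t Line -p 'Start Point:X' ./file.dwf")
double = shlex.split('dwg2csv -t Line -p "Start Point:X" ./file.dwf')

print(single)
print(single == double)
```

On such a shell the quote style alone would probably not change what dwg2csv receives. On other shells it can matter, for example Windows cmd does not treat single quotes as quoting characters.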

alex-b wrote:
Thu Nov 21, 2024 11:29 am
"I'm not interested in the text or the lines/polylines of the text, is there a way to ignore them?"
The last line in your snippet is a Text entity: the plain text is 'THERMAL INSULATION', the font is 'Swis721 LtCn BT', the height is 0.208 ...
Text is text, end of line.
QCAD cannot distinguish an ordinary Polyline from the Polylines/Lines/Arcs that result from exploding a Text entity.
There is no property along the lines of 'I once was Text'. :lol:

Still, a Polyline 0.045 long with zero area ... That is more like a single straight segment.
The Z coordinates are not zero ...

I think I would like to see the file in question.
You can always contact me by PM (Private Message).
What you send will be treated with the utmost confidentiality; I am not interested in what it represents.
The easiest route is to click on my user name and take it from there.

Regards,
CVH


Post by CVH » Sat Nov 23, 2024 12:17 pm

Hi,
Thanks for the file by PM.
It was very interesting to see how bad a drawing file can be. :lol:

:!: I would not stick to the DWF format.
I isolated several things (>91%) on dedicated layers, but it seems that additional layers are not stored.
Saving in this format also changed some color attributes.
But the main issue was that all information on dedicated layers was lost with a single click. :x

The first action should be: Flatten to 2D. I had some problems selecting things because of 3D content (Z not exactly zero).
The drawing has unit 'None' and was most probably drawn in meters.
Everything may indeed fit on an A4 page, as any drawing can; the question is at what scale and whether that is representative.
1:100 means that every cm on paper is a meter, and that can't be said when printed on A4. :wink:

Further, the setup of the file is very unconventional, and this is perhaps on purpose.
Protecting the art work of the engineering cooperation. :roll:

There is only one entity in Model_Space, a Block Reference based on Block 'Viewport1'.
As stated above, there is only one layer, the mandatory layer '0'.
This also means that entity attributes such as Color are set individually per entity.

:!: Linetypes are unused; dashed lines are drawn as individual segments.
-> As if the linetype was exploded, which is not an option in QCAD.
Not problematic as such, but because the lines are segmented, the Length property of each piece has no functional meaning.
But there are 2 very bad examples:
At about (1198.04, 385.044) there is a 'dashed line' 1.75 long, at 180 degrees, made up of 441 line segments (part of 'Handrail').
Another one starts at (1196.29, 375.184), at zero degrees, and is made up of 404 line segments.
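
If such dashes have to be treated as one segment again after export, one possible approach is to merge collinear pieces whose gaps are small. The sketch below is not something QCAD or dwg2csv does; the function names, the rounding precision and the `max_gap` threshold are all assumptions that would need tuning against the real drawing.

```python
import math
from collections import defaultdict

def merge_collinear(segments, max_gap, ndigits=6):
    """Merge segments lying on the same infinite line when the gap between them is <= max_gap."""
    eps = 10 ** -ndigits
    buckets = defaultdict(list)
    for (x1, y1), (x2, y2) in segments:
        length = math.hypot(x2 - x1, y2 - y1)
        if length == 0:
            continue  # zero-length pieces carry no direction
        ux, uy = (x2 - x1) / length, (y2 - y1) / length
        # Flip the direction so that 0 and 180 degrees share one bucket.
        if ux < -eps or (abs(ux) <= eps and uy < 0):
            ux, uy = -ux, -uy
        offset = y1 * ux - x1 * uy   # signed distance of the line from the origin
        t1 = x1 * ux + y1 * uy       # position of each end point along the line
        t2 = x2 * ux + y2 * uy
        key = (round(ux, ndigits), round(uy, ndigits), round(offset, ndigits))
        buckets[key].append((min(t1, t2), max(t1, t2)))

    merged = []
    for (ux, uy, offset), spans in buckets.items():
        spans.sort()
        start, end = spans[0]
        for s, e in spans[1:]:
            if s - end <= max_gap:
                end = max(end, e)    # close the gap and extend the current run
            else:
                merged.append(_to_points(ux, uy, offset, start, end))
                start, end = s, e
        merged.append(_to_points(ux, uy, offset, start, end))
    return merged

def _to_points(ux, uy, offset, t_start, t_end):
    """Convert positions along a line back to (x, y) end points."""
    return tuple((t * ux - offset * uy, t * uy + offset * ux) for t in (t_start, t_end))

# Made-up dashes: three pieces on y = 5 (one drawn right to left) and one vertical line.
dashes = [((0.0, 5.0), (1.0, 5.0)), ((2.5, 5.0), (1.5, 5.0)),
          ((3.0, 5.0), (4.0, 5.0)), ((10.0, 0.0), (10.0, 1.0))]
print(merge_collinear(dashes, max_gap=0.6))
```

Note that this would also merge dashes that are genuinely separate objects if they happen to be collinear and close together, so the result needs a visual check.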

The thick outer border lines are not lines; they are solid Hatches.

:!: Besides the solid Hatches, all patterned Hatches appear to be exploded to line-art.
-> For example: pattern ZIGZAG for the 'Swimming Pool' is 1778 lines, each 0.12527 long unless clipped.
It also seems that the source application was not able to define hatched areas with curved boundaries.
The 2 copies of the (fancy) North symbol account for 146 of the 175 solid Hatch entities, or 83%.
-> As if approximated by filled polygon areas, some of which overlap.

:!: There are no Dimension entities in this drawing; everything is line-art (using the architectural tick) plus Text entities.
Extension lines are dashed, typically using color #989898.
Dimension lines and ticks are continuous, typically using color #7f00ff.
Ticks are not all at 45-135-225-315 degrees; some angles vary, some are not really centered, and so on.
Level markers are line-art (a circle and 2 lines instead of a block), typically using color #ffbf00.
A Leader is a Polyline plus a solid Hatch as the arrow (without its boundary).
-> If we explode a Leader in QCAD we get a Polyline and a Solid; exploding the Solid results in the outline.
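
Since these annotation parts "typically" use their own colors, the `Color` column of the export offers a rough way to set them aside. This is a heuristic sketch only: the three color values come from the observations above, the function name and sample rows are made up, and anything drawn in the same colors for another reason would be dropped as well.

```python
import csv
import io

# Colors reported above for extension lines, dimension lines/ticks and level markers.
ANNOTATION_COLORS = {"#989898", "#7f00ff", "#ffbf00"}

def split_by_color(csv_file, skip_colors=ANNOTATION_COLORS):
    """Split non-empty rows into (kept, skipped) based on the 'Color' column."""
    kept, skipped = [], []
    for row in csv.DictReader(csv_file):
        if not row.get("Type"):
            continue  # ignore rows that are only commas
        color = (row.get("Color") or "").lower()
        (skipped if color in skip_colors else kept).append(row)
    return kept, skipped

# Made-up sample rows using the real column names 'Type', 'Handle' and 'Color'.
sample = io.StringIO(
    "Type,Handle,Color\n"
    "Line,0x1,#7f6f3f\n"
    "Line,0x2,#989898\n"
    ",,\n"
    "Line,0x3,#7F00FF\n"
)
kept, skipped = split_by_color(sample)
print(len(kept), "kept,", len(skipped), "skipped")
```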

And the biggest issue:
Besides the 459 Text entities, there is a lot of text, originally in 'RomanT', that has been exploded to line-art.
That alone accounts for 11050 of the 16696 entities in total, or 66%.


Even with the CSV export of your custom script, it will be a forest of line segments in which you cannot distinguish one from another.
More on your custom script by PM.
And yes, dwg2csv does seem slow for 16696 entities and 169 possible properties.
Even skipping requires some processing time for about 3 million entries, or many more when the vertices are included.
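
The figures above can be checked with a few lines of arithmetic. All numbers are taken from this post; only the variable names are made up.

```python
entities = 16696      # total entity count quoted in the post
properties = 169      # property columns in the export header

cells = entities * properties
print(f"{cells:,} entity/property combinations")   # 2,821,624
print(f"{11050 / entities:.0%} exploded text")      # 66%
print(f"{146 / 175:.0%} of the solid Hatches")      # 83%
```

So "about 3 million" is a rounding of roughly 2.8 million combinations, before polyline vertices are counted.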

I am quite convinced that the empty lines with commas are polyline vertices, empty because those properties were not requested.
In that case I would not list an 'empty' line.
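
One way to test this guess on an export is to count the empty rows that directly follow each Polyline row. This is only a sketch of the hypothesis, not a description of how dwg2csv writes vertices; it assumes the `Type` and `Handle` columns from the snippet earlier in the thread, and the sample data is made up.

```python
import csv
import io

def empty_rows_after_polylines(csv_file):
    """Map each Polyline handle to the number of empty rows directly following it."""
    reader = csv.reader(csv_file)
    header = next(reader)
    type_idx = header.index("Type")
    handle_idx = header.index("Handle")
    result = {}
    current = None  # handle of the Polyline whose trailing empty rows are being counted
    for row in reader:
        if not any(field.strip() for field in row):
            if current is not None:
                result[current] += 1
            continue
        if row[type_idx] == "Polyline":
            current = row[handle_idx]
            result[current] = 0
        else:
            current = None
    return result

# Made-up sample shaped like the snippet: one Polyline followed by three empty rows.
sample = io.StringIO(
    "Type,Handle,Length\n"
    ",,\n"
    "Line,0x2217,0.02\n"
    "Polyline,0x2218,0.045\n"
    ",,\n"
    ",,\n"
    ",,\n"
    "Text,0x2794,\n"
)
print(empty_rows_after_polylines(sample))
```

If the total over all Polylines comes close to the number of empty rows in the file, that would support the vertex explanation; empty rows that follow no Polyline at all would argue against it.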

Regards,
CVH
